SARS Exploration Server

Findex: improving search result use through automatic filtering categories

Internal identifier: 001C98 (Istex/Corpus); previous: 001C97; next: 001C99

Authors: Mika Käki; Anne Aula

Source: Interacting with Computers, vol. 17, no. 2 (2005), pp. 187-206

RBID : ISTEX:F70742F012AD9FF6A3BC4F946E17F85BCBBF22E0

Abstract

Long result lists from web search engines can be tedious to use. We designed a text categorization algorithm and a filtering user interface to address the problem. The Findex system provides an overview of the results by presenting a list of the most frequent words and phrases as result categories next to the actual results. Selecting a category (word or phrase) filters the result list to show only the results containing it. An experiment with 20 participants was conducted to compare the category design to the de facto standard solution (Google-type ranked list interface). Results show that the users were 25% faster and 21% more accurate with our system. In particular, participants’ speed of finding relevant results was 40% higher with the proposed system. Subjective ratings revealed significantly more positive attitudes towards the new system. Results indicate that the proposed design is feasible and beneficial.

Url: https://api.istex.fr/ark:/67375/HXZ-WSTC58GQ-9/fulltext.pdf
DOI: 10.1016/j.intcom.2005.01.001

Links to Exploration step

ISTEX:F70742F012AD9FF6A3BC4F946E17F85BCBBF22E0

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Findex: improving search result use through automatic filtering categories</title>
<author>
<name sortKey="K Ki, Mika" sort="K Ki, Mika" uniqKey="K Ki M" first="Mika" last="K Ki">Mika Käki</name>
<affiliation>
<mods:affiliation>Department of Computer Sciences, University of Tampere, Kehruukoulunkatu 1, FIN-33014 Tampere, Finland</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: mika.kaki@cs.uta.fi</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Corresponding author. Tel.: +358 3 215 6181; fax: +358 3 215 6070. E-mail addresses: mika.kaki@cs.uta.fi (M. Käki), anne.aula@cs.uta.fi (A. Aula).</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Aula, Anne" sort="Aula, Anne" uniqKey="Aula A" first="Anne" last="Aula">Anne Aula</name>
<affiliation>
<mods:affiliation>Department of Computer Sciences, University of Tampere, Kehruukoulunkatu 1, FIN-33014 Tampere, Finland</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:F70742F012AD9FF6A3BC4F946E17F85BCBBF22E0</idno>
<date when="2005" year="2005">2005</date>
<idno type="doi">10.1016/j.intcom.2005.01.001</idno>
<idno type="url">https://api.istex.fr/ark:/67375/HXZ-WSTC58GQ-9/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001C98</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001C98</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main">Findex: improving search result use through automatic filtering categories</title>
<author>
<name sortKey="K Ki, Mika" sort="K Ki, Mika" uniqKey="K Ki M" first="Mika" last="K Ki">Mika Käki</name>
<affiliation>
<mods:affiliation>Department of Computer Sciences, University of Tampere, Kehruukoulunkatu 1, FIN-33014 Tampere, Finland</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: mika.kaki@cs.uta.fi</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>Corresponding author. Tel.: +358 3 215 6181; fax: +358 3 215 6070. E-mail addresses: mika.kaki@cs.uta.fi (M. Käki), anne.aula@cs.uta.fi (A. Aula).</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Aula, Anne" sort="Aula, Anne" uniqKey="Aula A" first="Anne" last="Aula">Anne Aula</name>
<affiliation>
<mods:affiliation>Department of Computer Sciences, University of Tampere, Kehruukoulunkatu 1, FIN-33014 Tampere, Finland</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j" type="main">Interacting with Computers</title>
<idno type="ISSN">0953-5438</idno>
<idno type="eISSN">1873-7951</idno>
<imprint>
<publisher>Oxford University Press</publisher>
<pubPlace>Oxford, UK</pubPlace>
<date type="published">2005</date>
<biblScope unit="vol">17</biblScope>
<biblScope unit="issue">2</biblScope>
<biblScope unit="page" from="187">187</biblScope>
<biblScope unit="page" to="206">206</biblScope>
</imprint>
<idno type="ISSN">0953-5438</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0953-5438</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract">Long result lists from web search engines can be tedious to use. We designed a text categorization algorithm and a filtering user interface to address the problem. The Findex system provides an overview of the results by presenting a list of the most frequent words and phrases as result categories next to the actual results. Selecting a category (word or phrase) filters the result list to show only the results containing it. An experiment with 20 participants was conducted to compare the category design to the de facto standard solution (Google-type ranked list interface). Results show that the users were 25% faster and 21% more accurate with our system. In particular, participants’ speed of finding relevant results was 40% higher with the proposed system. Subjective ratings revealed significantly more positive attitudes towards the new system. Results indicate that the proposed design is feasible and beneficial.</div>
</front>
</TEI>
<istex>
<corpusName>oup</corpusName>
<keywords>
<teeft>
<json:string>user interface</json:string>
<json:string>categorization</json:string>
<json:string>kaki</json:string>
<json:string>relevant result</json:string>
<json:string>aula</json:string>
<json:string>interface</json:string>
<json:string>usability</json:string>
<json:string>user</json:string>
<json:string>dumais</json:string>
<json:string>search result</json:string>
<json:string>search engine</json:string>
<json:string>snippet</json:string>
<json:string>category user interface</json:string>
<json:string>shuttle</json:string>
<json:string>query</json:string>
<json:string>category interface</json:string>
<json:string>finding</json:string>
<json:string>challenger</json:string>
<json:string>reference user interface</json:string>
<json:string>google</json:string>
<json:string>frequent word</json:string>
<json:string>task window</json:string>
<json:string>reference interface</json:string>
<json:string>algorithm</json:string>
<json:string>subjective rating</json:string>
<json:string>result list</json:string>
<json:string>search window</json:string>
<json:string>participant</json:string>
<json:string>space shuttle challenger disaster</json:string>
<json:string>search user interface</json:string>
<json:string>category list</json:string>
<json:string>positive attitude</json:string>
<json:string>super phrase</json:string>
<json:string>categorization technique</json:string>
<json:string>filtering user interface</json:string>
<json:string>relevant category</json:string>
<json:string>roger boisjoly</json:string>
<json:string>challenger disaster</json:string>
<json:string>categorization algorithm</json:string>
<json:string>result view</json:string>
<json:string>corresponding result</json:string>
<json:string>more positive attitude</json:string>
<json:string>search engine user interface</json:string>
<json:string>result show</json:string>
<json:string>word list</json:string>
<json:string>category</json:string>
<json:string>word frequency</json:string>
<json:string>result listing</json:string>
<json:string>much meaning</json:string>
<json:string>information need</json:string>
<json:string>usability evaluation</json:string>
<json:string>language detection</json:string>
<json:string>result item</json:string>
<json:string>document categorization</json:string>
<json:string>information retrieval</json:string>
<json:string>long result list</json:string>
<json:string>rank order</json:string>
<json:string>finding relevant result</json:string>
<json:string>textual information</json:string>
<json:string>digital library</json:string>
<json:string>space shuttle</json:string>
<json:string>multiple category</json:string>
<json:string>search term</json:string>
<json:string>same result</json:string>
<json:string>result categorization</json:string>
<json:string>more accurate</json:string>
<json:string>actual result</json:string>
<json:string>information access</json:string>
<json:string>radio button</json:string>
<json:string>pilot test</json:string>
<json:string>second question</json:string>
<json:string>independent variable</json:string>
<json:string>task description</json:string>
<json:string>task execution</json:string>
<json:string>total time</json:string>
<json:string>real situation</json:string>
<json:string>time limit</json:string>
<json:string>task time</json:string>
<json:string>planet jupiter</json:string>
<json:string>average time</json:string>
<json:string>second relevant result</json:string>
<json:string>subjective measure</json:string>
<json:string>point scale</json:string>
<json:string>median score</json:string>
<json:string>wilcoxon test</json:string>
<json:string>good reason</json:string>
<json:string>retrieval result</json:string>
<json:string>short text summary</json:string>
</teeft>
</keywords>
<author>
<json:item>
<name>Mika Käki</name>
<affiliations>
<json:string>Department of Computer Sciences, University of Tampere, Kehruukoulunkatu 1, FIN-33014 Tampere, Finland</json:string>
<json:string>E-mail: mika.kaki@cs.uta.fi</json:string>
<json:string>Corresponding author. Tel.: +358 3 215 6181; fax: +358 3 215 6070. E-mail addresses: mika.kaki@cs.uta.fi (M. Käki), anne.aula@cs.uta.fi (A. Aula).</json:string>
</affiliations>
</json:item>
<json:item>
<name>Anne Aula</name>
<affiliations>
<json:string>Department of Computer Sciences, University of Tampere, Kehruukoulunkatu 1, FIN-33014 Tampere, Finland</json:string>
</affiliations>
</json:item>
</author>
<subject>
<json:item>
<value>Web search</value>
</json:item>
<json:item>
<value>Search user interface</value>
</json:item>
<json:item>
<value>Categorization</value>
</json:item>
<json:item>
<value>Clustering</value>
</json:item>
<json:item>
<value>Information access</value>
</json:item>
</subject>
<arkIstex>ark:/67375/HXZ-WSTC58GQ-9</arkIstex>
<language>
<json:string>unknown</json:string>
</language>
<originalGenre>
<json:string>research-article</json:string>
</originalGenre>
<abstract>Long result lists from web search engines can be tedious to use. We designed a text categorization algorithm and a filtering user interface to address the problem. The Findex system provides an overview of the results by presenting a list of the most frequent words and phrases as result categories next to the actual results. Selecting a category (word or phrase) filters the result list to show only the results containing it. An experiment with 20 participants was conducted to compare the category design to the de facto standard solution (Google-type ranked list interface). Results show that the users were 25% faster and 21% more accurate with our system. In particular, participants’ speed of finding relevant results was 40% higher with the proposed system. Subjective ratings revealed significantly more positive attitudes towards the new system. Results indicate that the proposed design is feasible and beneficial.</abstract>
<qualityIndicators>
<score>6.74</score>
<pdfWordCount>6889</pdfWordCount>
<pdfCharCount>40331</pdfCharCount>
<pdfVersion>1.3</pdfVersion>
<pdfPageCount>20</pdfPageCount>
<pdfPageSize>468 x 680 pts</pdfPageSize>
<pdfWordsPerPage>344</pdfWordsPerPage>
<pdfText>true</pdfText>
<refBibsNative>true</refBibsNative>
<abstractWordCount>145</abstractWordCount>
<abstractCharCount>926</abstractCharCount>
<keywordCount>5</keywordCount>
</qualityIndicators>
<title>Findex: improving search result use through automatic filtering categories</title>
<genre>
<json:string>research-article</json:string>
</genre>
<host>
<title>Interacting with Computers</title>
<language>
<json:string>unknown</json:string>
</language>
<issn>
<json:string>0953-5438</json:string>
</issn>
<eissn>
<json:string>1873-7951</json:string>
</eissn>
<publisherId>
<json:string>iwc</json:string>
</publisherId>
<volume>17</volume>
<issue>2</issue>
<pages>
<first>187</first>
<last>206</last>
</pages>
<genre>
<json:string>journal</json:string>
</genre>
</host>
<namedEntities>
<unitex>
<date>
<json:string>2005</json:string>
</date>
<geogName></geogName>
<orgName>
<json:string>Jupiter Research Company</json:string>
<json:string>Online Ethics Center</json:string>
<json:string>Finland Received</json:string>
<json:string>Graduate School</json:string>
</orgName>
<orgName_funder>
<json:string>Graduate School</json:string>
</orgName_funder>
<orgName_provider></orgName_provider>
<persName>
<json:string>Johanna Hoysniemi</json:string>
<json:string>Studies</json:string>
<json:string>Anne Aula</json:string>
<json:string>Tomi Heimonen</json:string>
<json:string>Results</json:string>
<json:string>Harri Siirtola</json:string>
<json:string>Martin Porter</json:string>
<json:string>Poika Isokoski</json:string>
<json:string>The</json:string>
<json:string>Natalie Jhaveri</json:string>
<json:string>A. Aula</json:string>
<json:string>Roger Boisjoly</json:string>
<json:string>M. Kaki</json:string>
</persName>
<placeName>
<json:string>Java</json:string>
<json:string>New Zealand</json:string>
</placeName>
<ref_url>
<json:string>http://www.google.com</json:string>
<json:string>http://www.vivisimo.com</json:string>
</ref_url>
<ref_bibl>
<json:string>Dumais et al. (2001)</json:string>
<json:string>Zamir and Etzioni, 1998</json:string>
<json:string>picture after Cho and Myaeng, 2000</json:string>
<json:string>Chen et al., 1999</json:string>
<json:string>Chekuri et al. (1997)</json:string>
<json:string>Hearst and Pedersen, 1996</json:string>
<json:string>Wittenburg and Sigman, 1997</json:string>
<json:string>Zamir and Etzioni (1998)</json:string>
<json:string>picture after Pratt and Fagan, 2000</json:string>
<json:string>Cho and Myaeng, 2000</json:string>
<json:string>Jones and Paynter, 1999</json:string>
<json:string>Dumais et al. (1988)</json:string>
<json:string>Dumais and Chen, 2001</json:string>
<json:string>Pirolli et al., 1996</json:string>
<json:string>Jansen et al., 1998</json:string>
<json:string>Spink et al., 2001</json:string>
<json:string>Dumais et al., 2001</json:string>
<json:string>Cutting et al., 1992</json:string>
<json:string>Maarek et al., 2000</json:string>
<json:string>Jones et al., 2004</json:string>
<json:string>Pratt and Fagan, 2000</json:string>
<json:string>Chen and Dumais (2000)</json:string>
<json:string>Zamir and Etzioni, 1999</json:string>
<json:string>picture after Zamir and Etzioni, 1998</json:string>
<json:string>Kaki, 2004</json:string>
<json:string>Chen and Dumais, 2000</json:string>
</ref_bibl>
<bibl></bibl>
</unitex>
</namedEntities>
<ark>
<json:string>ark:/67375/HXZ-WSTC58GQ-9</json:string>
</ark>
<categories>
<wos>
<json:string>1 - social science</json:string>
<json:string>2 - ergonomics</json:string>
<json:string>1 - science</json:string>
<json:string>2 - computer science, cybernetics</json:string>
</wos>
<scienceMetrix>
<json:string>1 - health sciences</json:string>
<json:string>2 - psychology &amp; cognitive sciences</json:string>
<json:string>3 - human factors</json:string>
</scienceMetrix>
<scopus>
<json:string>1 - Physical Sciences</json:string>
<json:string>2 - Computer Science</json:string>
<json:string>3 - Human-Computer Interaction</json:string>
<json:string>1 - Physical Sciences</json:string>
<json:string>2 - Computer Science</json:string>
<json:string>3 - Software</json:string>
</scopus>
<inist>
<json:string>1 - sciences humaines et sociales</json:string>
</inist>
</categories>
<publicationDate>2005</publicationDate>
<copyrightDate>2005</copyrightDate>
<doi>
<json:string>10.1016/j.intcom.2005.01.001</json:string>
</doi>
<id>F70742F012AD9FF6A3BC4F946E17F85BCBBF22E0</id>
<score>1</score>
<fulltext>
<json:item>
<extension>pdf</extension>
<original>true</original>
<mimetype>application/pdf</mimetype>
<uri>https://api.istex.fr/ark:/67375/HXZ-WSTC58GQ-9/fulltext.pdf</uri>
</json:item>
<json:item>
<extension>zip</extension>
<original>false</original>
<mimetype>application/zip</mimetype>
<uri>https://api.istex.fr/ark:/67375/HXZ-WSTC58GQ-9/bundle.zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/ark:/67375/HXZ-WSTC58GQ-9/fulltext.tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main">Findex: improving search result use through automatic filtering categories</title>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher>Oxford University Press</publisher>
<pubPlace>Oxford, UK</pubPlace>
<availability>
<licence>© 2005 Elsevier B.V. All rights reserved.</licence>
</availability>
<date type="Copyright" when="2005">2005</date>
<date type="published">2005</date>
</publicationStmt>
<notesStmt>
<note type="content-type" source="research-article" scheme="https://content-type.data.istex.fr/ark:/67375/XTP-1JC4F85T-7">research-article</note>
<note type="publication-type" scheme="https://publication-type.data.istex.fr/ark:/67375/JMC-0GLKJH51-B">journal</note>
</notesStmt>
<sourceDesc>
<biblStruct type="article">
<analytic>
<title level="a" type="main">Findex: improving search result use through automatic filtering categories</title>
<author xml:id="author-0000">
<persName>
<surname>Käki</surname>
<forename type="first">Mika</forename>
</persName>
</author>
<author xml:id="author-0001">
<persName>
<surname>Aula</surname>
<forename type="first">Anne</forename>
</persName>
</author>
<idno type="istex">F70742F012AD9FF6A3BC4F946E17F85BCBBF22E0</idno>
<idno type="ark">ark:/67375/HXZ-WSTC58GQ-9</idno>
<idno type="DOI">10.1016/j.intcom.2005.01.001</idno>
</analytic>
<monogr>
<title level="j" type="main">Interacting with Computers</title>
<idno type="hwp">iwc</idno>
<idno type="publisher-id">iwc</idno>
<idno type="pISSN">0953-5438</idno>
<idno type="eISSN">1873-7951</idno>
<imprint>
<publisher>Oxford University Press</publisher>
<pubPlace>Oxford, UK</pubPlace>
<date type="published">2005</date>
<biblScope unit="vol">17</biblScope>
<biblScope unit="issue">2</biblScope>
<biblScope unit="page" from="187">187</biblScope>
<biblScope unit="page" to="206">206</biblScope>
</imprint>
</monogr>
</biblStruct>
</sourceDesc>
</fileDesc>
<encodingDesc>
<schemaRef type="ODD" url="https://xml-schema.delivery.istex.fr/tei-istex.odd"></schemaRef>
<appInfo>
<application ident="pub2tei" version="1.0.41" when="2020-04-06">
<label>pub2TEI-ISTEX</label>
<desc>A set of style sheets for converting XML documents encoded in various scientific publisher formats into a common TEI format.
<ref target="http://www.tei-c.org/">We use TEI</ref>
</desc>
</application>
</appInfo>
</encodingDesc>
<profileDesc>
<abstract>
<head>Abstract</head>
<p>Long result lists from web search engines can be tedious to use. We designed a text categorization algorithm and a filtering user interface to address the problem. The Findex system provides an overview of the results by presenting a list of the most frequent words and phrases as result categories next to the actual results. Selecting a category (word or phrase) filters the result list to show only the results containing it. An experiment with 20 participants was conducted to compare the category design to the de facto standard solution (Google-type ranked list interface). Results show that the users were 25% faster and 21% more accurate with our system. In particular, participants’ speed of finding relevant results was 40% higher with the proposed system. Subjective ratings revealed significantly more positive attitudes towards the new system. Results indicate that the proposed design is feasible and beneficial.</p>
</abstract>
<textClass ana="subject">
<keywords scheme="heading">
<term>Articles</term>
</keywords>
</textClass>
<textClass ana="keyword">
<keywords>
<term>Web search</term>
<term>Search user interface</term>
<term>Categorization</term>
<term>Clustering</term>
<term>Information access</term>
</keywords>
</textClass>
<langUsage>
<language ident="EN"></language>
</langUsage>
</profileDesc>
<revisionDesc>
<change when="2020-04-06" who="#istex" xml:id="pub2tei">formatting</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<extension>txt</extension>
<original>false</original>
<mimetype>text/plain</mimetype>
<uri>https://api.istex.fr/ark:/67375/HXZ-WSTC58GQ-9/fulltext.txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="corpus oup, element #text not found" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="utf-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" URI="journalpublishing.dtd" name="istex:docType"></istex:docType>
<istex:document>
<article article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="hwp">iwc</journal-id>
<journal-id journal-id-type="publisher-id">iwc</journal-id>
<journal-title>Interacting with Computers</journal-title>
<issn pub-type="ppub">0953-5438</issn>
<issn pub-type="epub">1873-7951</issn>
<publisher>
<publisher-name>Oxford University Press</publisher-name>
<publisher-loc>Oxford, UK</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.1016/j.intcom.2005.01.001</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Articles</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Findex: improving search result use through automatic filtering categories</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Käki</surname>
<given-names>Mika</given-names>
</name>
<xref ref-type="corresp" rid="cor1">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Aula</surname>
<given-names>Anne</given-names>
</name>
<xref ref-type="fn" rid="fn1">1</xref>
</contrib>
<aff>Department of Computer Sciences, University of Tampere, Kehruukoulunkatu 1, FIN-33014 Tampere, Finland</aff>
</contrib-group>
<author-notes>
<corresp id="cor1">
<label>*</label>
Corresponding author. Tel.: +358 3 215 6181; fax: +358 3 215 6070.
<italic>E-mail addresses:</italic>
<email>mika.kaki@cs.uta.fi</email>
(M. Käki),
<email>anne.aula@cs.uta.fi</email>
(A. Aula).</corresp>
</author-notes>
<pub-date pub-type="ppub">
<month>3</month>
<year>2005</year>
</pub-date>
<volume>17</volume>
<issue>2</issue>
<fpage>187</fpage>
<lpage>206</lpage>
<history>
<date date-type="received">
<day>24</day>
<month>6</month>
<year>2004</year>
</date>
<date date-type="rev-recd">
<day>6</day>
<month>8</month>
<year>2004</year>
</date>
<date date-type="accepted">
<day>10</day>
<month>1</month>
<year>2005</year>
</date>
</history>
<copyright-statement>© 2005 Elsevier B.V. All rights reserved.</copyright-statement>
<copyright-year>2005</copyright-year>
<abstract>
<title>Abstract</title>
<p>Long result lists from web search engines can be tedious to use. We designed a text categorization algorithm and a filtering user interface to address the problem. The Findex system provides an overview of the results by presenting a list of the most frequent words and phrases as result categories next to the actual results. Selecting a category (word or phrase) filters the result list to show only the results containing it. An experiment with 20 participants was conducted to compare the category design to the de facto standard solution (Google-type ranked list interface). Results show that the users were 25% faster and 21% more accurate with our system. In particular, participants’ speed of finding relevant results was 40% higher with the proposed system. Subjective ratings revealed significantly more positive attitudes towards the new system. Results indicate that the proposed design is feasible and beneficial.</p>
</abstract>
<kwd-group>
<title>Keywords</title>
<kwd>Web search</kwd>
<kwd>Search user interface</kwd>
<kwd>Categorization</kwd>
<kwd>Clustering</kwd>
<kwd>Information access</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec>
<label>1</label>
<title>Introduction</title>
<p>Web search engines are one of the most popular means of finding information on the World Wide Web. The huge number of documents requires users to describe their information need very precisely in order to avoid overly long result lists. Indeed, formulating the information need accurately in the search query is known to be hard for typical web users. Studies show that people enter very few search terms, typically one or two (
<xref ref-type="bibr" rid="bib10">Jansen et al., 1998</xref>
). Such queries result in huge result sets which are hard to understand and slow to browse through.</p>
<p>Very long result lists are clearly a major usability problem and a challenge for the search engine user interfaces. To solve this problem, information scientists look for better retrieval algorithms to get better results in the first place and human–computer interaction practitioners work for improved user interfaces for result handling. We follow the latter path and state the first question of the study: how to present the search results so that users are able to find the needed information efficiently?</p>
<p>We propose a new filtering user interface based on automatic result categories for accessing the results. The system is called Findex. The user interface presents an overview of the results to help the users identify and access the interesting results quickly. The overview is constructed by computing the most frequent words and phrases in the results and presenting them to the user as a list of categories next to the result list. The user can then select an interesting category from the list and the user interface filters the result list to show the corresponding search results.</p>
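<p>The overview computation described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the Findex implementation: the stop-word list, the two-word phrase limit, and the function names (tokenize, candidate_terms, build_categories, filter_results) are assumptions made for the example. It counts, for each word or short phrase, how many results contain it, offers the most frequent ones as categories, and filters the result list by a selected category.</p>
<preformat>
```python
import re
from collections import Counter

# Hypothetical stop-word list; the source does not give the one Findex uses.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on"}


def tokenize(text):
    """Lower-case a result snippet and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def candidate_terms(words, max_len=2):
    """Return the set of words and short phrases that occur in one result."""
    terms = set()
    for size in range(1, max_len + 1):
        for start in range(len(words) - size + 1):
            chunk = words[start:start + size]
            # Skip candidates that begin or end with a stop word.
            if chunk[0] in STOP_WORDS or chunk[-1] in STOP_WORDS:
                continue
            terms.add(" ".join(chunk))
    return terms


def build_categories(results, top_n=5):
    """Rank terms by the number of results that contain them."""
    counts = Counter()
    for text in results:
        counts.update(candidate_terms(tokenize(text)))
    return counts.most_common(top_n)


def filter_results(results, category):
    """Keep only the results whose text contains the selected category."""
    return [r for r in results if category in candidate_terms(tokenize(r))]
```
</preformat>
<p>In this sketch each term is counted once per result, so a category's count equals the number of results that remain after selecting it. Whether Findex counts terms this way is an assumption of the example.</p>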
<p>In order to test the feasibility and the usefulness of the design, we implemented it and conducted an experiment with 20 participants. The experiment compared our solution to a currently widely accepted search user interface model (ranked list). The experiment aimed to answer the second question of this study: can users understand the user interface and is it beneficial in accessing the search results? The results show that the new user interface was faster and more accurate compared to the conventional one and users expressed positive attitudes towards it.</p>
<p>In the following, we take a brief look at related work in the field. After that, the algorithm and the user interface of the proposed system are described thoroughly, followed by a description of the experiment and its results. Finally, the findings are summed up in the conclusions.</p>
</sec>
<sec>
<label>2</label>
<title>Related work</title>
<p>Three areas of research are relevant for this study. Firstly, research on the categorization of textual documents (web or otherwise) sets the background for our categorization mechanism. Secondly, work on understanding and improving web search user interfaces provides us with information on search interface usability. Thirdly, studies on usability evaluation of web search engines give examples of how to study this phenomenon. We concentrate on papers that cover several of these areas as they are the closest references.</p>
<p>Document categorization has a long tradition in the information retrieval (IR) community. The incompleteness and imprecision of simple lexical text matching methods have been identified, and document categorization is regarded as one option to overcome these difficulties. We can identify two commonly used categorization techniques: document classification means putting documents into predefined categories, whereas document clustering refers to dividing a set of documents into groups based on their similarity (
<xref ref-type="bibr" rid="bib13">Maarek et al., 2000</xref>
). Various implementation techniques have been proposed for both.
<fig id="fig1" position="float">
<label>Fig. 1</label>
<caption>
<p>Categorizing web search user interface: Grouper (picture after
<xref ref-type="bibr" rid="bib18">Zamir and Etzioni, 1998</xref>
).</p>
</caption>
<graphic xlink:href="iwc17-0187-f1.tif"></graphic>
</fig>
</p>
<p>
<xref ref-type="bibr" rid="bib7">Dumais et al. (1988)</xref>
were among the first in the HCI community to suggest the use of clustering techniques (in their case based on Latent Semantic Indexing, LSI) for improving the access to textual information. Later, Scatter/Gather (
<xref ref-type="bibr" rid="bib5">Cutting et al., 1992</xref>
) was one of the first real systems where the clustering approach was usability tested. The first results were not promising as the clustering interface seemed both slower and less precise when searching for relevant articles on a given topic (
<xref ref-type="bibr" rid="bib14">Pirolli et al., 1996</xref>
). However, in a follow-up study, Scatter/Gather was found to have potential because users were able to identify and use the most relevant categories in information gathering tasks (
<xref ref-type="bibr" rid="bib9">Hearst and Pedersen, 1996</xref>
).</p>
<p>Later on,
<xref ref-type="bibr" rid="bib18">Zamir and Etzioni (1998)</xref>
demonstrated the technical feasibility of clustering techniques in web environment using their own algorithm. They have also proposed a search engine user interface—Grouper (
<xref ref-type="bibr" rid="bib19">Zamir and Etzioni, 1999</xref>
) (
<xref ref-type="fig" rid="fig1">Fig. 1</xref>
)—based on the clustering algorithm. The interface presents the titles of sample result pages grouped in clusters and resembles the user interface of Scatter/Gather. In addition, Grouper lets the user refine the query by selecting keywords from a category. Unfortunately, controlled usability tests on Grouper have not been published, but evaluation has been based on log studies and measuring the properties of the algorithm.</p>
<p>The DART (
<xref ref-type="bibr" rid="bib4">Cho and Myaeng, 2000</xref>
) system provides another type of user interface for a clustering system. DART uses the same clustering algorithm as Grouper. The contribution of DART is a dartboard-like user interface (
<xref ref-type="fig" rid="fig2">Fig. 2</xref>
) which visualizes the result set in relation to the clusters. The system has been usability tested, but the results are hard to interpret. For example, the number of participants was not reported.
<fig id="fig2" position="float">
<label>Fig. 2</label>
<caption>
<p>User interface of DART (picture after
<xref ref-type="bibr" rid="bib4">Cho and Myaeng, 2000</xref>
).</p>
</caption>
<graphic xlink:href="iwc17-0187-f2.tif"></graphic>
</fig>
</p>
<p>One problem with clustering is the naming of the clusters. Typically, the most frequent or most distinctive word(s) found in the documents of the cluster are used as the name. This, however, may lead to rather long, uninformative, and incomprehensible names. A solution to this problem is to use document classification instead. When document categories are predefined, the names can also be predefined to correctly express the intended meaning.</p>
<p>
<xref ref-type="bibr" rid="bib1">Chekuri et al. (1997)</xref>
used this classification approach in a document classifier for web searches. They envisioned that the user could choose one or more categories along with the search terms when submitting a query. Unfortunately, neither the user interface nor a usability evaluation is reported for this solution.</p>
<p>A similar solution has been the focus of the recent SWISH prototype (
<xref ref-type="bibr" rid="bib2 bib6 bib8">Chen and Dumais, 2000; Dumais and Chen, 2001; Dumais et al., 2001</xref>
). SWISH has a classifier for web search engine results and a special user interface for it (
<xref ref-type="fig" rid="fig3">Fig. 3</xref>
). The proposed solutions have also been thoroughly evaluated (both algorithmically and for usability). The user studies conclude that categories are indeed faster and more efficient for particular types of tasks. The studies compared multiple user interface solutions and found that categories are most effective when presented with some sample results. The examples seem to help users understand the meaning of a category.</p>
<p>In addition to text-based clustering and classification, other categorization methods have also been identified. For instance, Cha–Cha (
<xref ref-type="bibr" rid="bib3">Chen et al., 1999</xref>
) and AMIT (
<xref ref-type="bibr" rid="bib17">Wittenburg and Sigman, 1997</xref>
) use hypertext link structure as the basis for the categorization of the documents. The usability of Cha–Cha has been under investigation and answers to a questionnaire showed positive attitudes towards the system. However, without more objective measures, the results are rather speculative. Another example of a different categorization scheme is DynaCat (
<xref ref-type="bibr" rid="bib15">Pratt and Fagan, 2000</xref>
), which uses domain knowledge (man-made taxonomies) in its classification process. DynaCat has an overview-based user interface (
<xref ref-type="fig" rid="fig4">Fig. 4</xref>
) and a user test showed positive results about its usefulness.
<fig id="fig3" position="float">
<label>Fig. 3</label>
<caption>
<p>Result classifying SWISH web search interface (picture after
<xref ref-type="bibr" rid="bib8">Dumais et al., 2001</xref>
).</p>
</caption>
<graphic xlink:href="iwc17-0187-f3.tif"></graphic>
</fig>
</p>
<p>The digital library project in New Zealand adopted yet another way of providing summaries of textual material. The project produced multiple examples of user interfaces that are based on key phrase extraction. The technique has been used to find related documents (
<xref ref-type="bibr" rid="bib11">Jones and Paynter, 1999</xref>
) as well as to categorize search results on small devices (
<xref ref-type="bibr" rid="bib12">Jones et al., 2004</xref>
).</p>
<p>A commercial product called Vivísimo (
<ext-link ext-link-type="uri" xlink:href="http://www.vivisimo.com">http://www.vivisimo.com</ext-link>
) uses categorization, but neither a description of the categorization algorithm nor usability test results have been published. The user interface of Vivísimo closely resembles ours, as it initially displays 10 categories beside the actual results. The biggest difference is that Vivísimo utilizes a hierarchical categorization scheme whereas ours is based on a simpler list. In the Vivísimo user interface, the initial top-level categories can be further explored by looking at their subcategories. The actual categorization scheme of Vivísimo is unknown, but it seems to utilize frequently occurring words, words that occur frequently together in the same result, and frequently occurring word strings (phrases). The hierarchy is presumably achieved by applying the same categorization scheme recursively to the top-level categories.
<fig id="fig4" position="float">
<label>Fig. 4</label>
<caption>
<p>DynaCat system that uses domain knowledge for result categorization (picture after
<xref ref-type="bibr" rid="bib15">Pratt and Fagan, 2000</xref>
).</p>
</caption>
<graphic xlink:href="iwc17-0187-f4.tif"></graphic>
</fig>
</p>
<p>In summary, the problems in the past studies and systems that are relevant here are:
<list list-type="bullet">
<list-item>
<p>Lack of thorough user experiments. We have only limited knowledge on the usability of result categorization systems.</p>
</list-item>
<list-item>
<p>Utilization of too complex categorization techniques that are arguably not understood by the users.</p>
</list-item>
</list>
</p>
<p>Our work aims to solve these problems.</p>
</sec>
<sec>
<label>3</label>
<title>System description</title>
<p>Technically, our solution falls somewhere between Grouper and the user interface developed by
<xref ref-type="bibr" rid="bib2">Chen and Dumais (2000)</xref>
. It does not use predefined categories like Chen and Dumais’ system, but it does not use classical clustering techniques, either. Instead, we simply look for the most frequent words or phrases among the results and use them as the categories. The categories are shown in a separate list beside the results. Selecting a category displays the corresponding results in the result list, that is, filters the result set. The actual searches are done through the Google Web API (
<ext-link ext-link-type="uri" xlink:href="http://www.google.com">http://www.google.com</ext-link>
). We shall first describe the categorization algorithm.</p>
<sec>
<label>3.1</label>
<title>Result categorization</title>
<p>One of the restrictions in the web environment is that the whole document text body is not available for the categorization process. As others have demonstrated (
<xref ref-type="bibr" rid="bib8 bib18">Dumais et al., 2001; Zamir and Etzioni, 1998</xref>
), clustering and classification methods can be used to categorize web search results based solely on the short text summaries (snippets) returned by the search engine. However, we believe that the naming problem associated with clustering, the limitations of classification, and the complexity of both could be avoided with a different approach. To make the system understandable and to let users feel in control, a simpler solution is desirable.</p>
<p>Our categorization is solely based on word frequencies in the result listing, i.e. in titles and short text summaries (snippets). We basically select the
<italic>n</italic>
most frequent words and use them as the categories. Such a category then contains all the results where the word appears. It is commonly known that simply selecting the most frequent words does not work, because articles and other very frequent words (like ‘and’) do not carry much meaning on their own. We use a stop word list to exclude such words from the category list.</p>
<p>The second problem in simple lexical word matching is that simple inflections of the words make them different (e.g. ‘car’ and ‘cars’ would be two different words). In order to reduce this problem, we use a word stemmer (Snowball stemmer by Martin Porter,
<ext-link ext-link-type="uri" xlink:href="http://snowball.tartarus.org/">http://snowball.tartarus.org/</ext-link>
). The stemmer removes the word endings so that the simple inflections of a word map to the same word stem.</p>
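<p>As a minimal sketch of the stop word and stemming steps described above, the following Python fragment counts stems across result snippets and keeps the n most frequent ones. It is not the Findex implementation: the stop word list, the toy_stem function (a crude stand-in for the Snowball stemmer), the tie-breaking rule, and the choice to count each stem once per result are all assumptions made for illustration.</p>

```python
from collections import defaultdict

# Hypothetical stop word list; the actual list used in the paper is not given.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "is"}


def toy_stem(word: str) -> str:
    """Crude stand-in for a real stemmer: strips a plural 's' only."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def word_categories(snippets: list[str], n: int) -> list[tuple[str, list[int]]]:
    """Return the n most frequent stems with the indices of the results containing them."""
    results_by_stem: dict[str, set[int]] = defaultdict(set)
    for index, text in enumerate(snippets):
        for raw in text.lower().split():
            word = raw.strip(".,;:!?()\"'")
            if word and word not in STOP_WORDS:
                results_by_stem[toy_stem(word)].add(index)
    # Assumption: frequency is the number of results containing the stem.
    # Ties go to the stem first seen higher in the ranked list, then alphabetically.
    ranked = sorted(results_by_stem.items(),
                    key=lambda item: (-len(item[1]), min(item[1]), item[0]))
    return [(stem, sorted(ids)) for stem, ids in ranked[:n]]


snippets = [
    "Jaguar cars and the Jaguar club",
    "The jaguar is the largest cat in the Americas",
    "Reviews of classic cars",
]
categories = word_categories(snippets, 3)
```

<p>With these three toy snippets, ‘cars’ in the first and third snippet map to the same stem, so the stem forms a single category covering both results.</p>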
<p>Both of the previous techniques are language-sensitive. Our software has been built in a way that enables us to easily add more languages as desired. For a new language, we need: (1) a stemmer, (2) a word frequency list of the language (corpus), and (3) a stop word list. The corpus is used for language detection and it could also be used to approximate the stop word list automatically, but a human-made list is preferred for better accuracy. For testing purposes we have implemented the needed functions for English and Finnish. Language detection of the results is automatic and is based on word frequencies in the corpora.</p>
<p>With this simple logic, we get a list of words capturing the major topics of the results fairly well. However, some words acquire considerably more meaning when presented in context. For example, the word ‘states’ does not convey as much meaning as the phrase ‘united states’. To present the user with more meaningful categories we search for the most frequent phrases in the results as well. A phrase is defined to be any string of words inside a sentence (between periods).</p>
<p>Categories are computed right after the search engine has returned the results. Each word in the results, except the stop words, is stemmed and stored with information on the result item that contained the word. For the phrases the procedure is similar. As each sentence is broken into phrases, each word comprising a phrase is stemmed and the resulting stemmed phrase is stored with information about which result contains it.</p>
<p>As the category candidates are processed, unique words and phrases and so-called sub phrases are removed. Sub phrases are parts of a longer phrase (super phrase). For example, if we have a super phrase ‘united states’, there will be corresponding sub phrases ‘united’ and ‘states’ among the candidates. All sub phrases that are part of a super phrase in the same result are removed. Candidates are sorted by frequency, and the first
<italic>n</italic>
(currently 15) candidates are selected as the categories. We are currently preparing another paper describing the algorithm in more detail. Some examples of the resulting categories for a few queries can be seen in
<xref ref-type="table" rid="tbl1">Table 1</xref>
.
<table-wrap id="tbl1" position="float">
<label>Table 1</label>
<caption>
<p>Categories calculated for a few popular queries</p>
</caption>
<table>
<thead>
<tr>
<th align="left">Query</th>
<th align="left">Challenger</th>
<th align="left">Sars</th>
<th align="left">Jaguar</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="15">Categories</td>
<td align="left">Space shuttle challenger</td>
<td align="left">Health</td>
<td align="left">Club</td>
</tr>
<tr>
<td align="left">Challenger disaster</td>
<td align="left">Information</td>
<td align="left">Jaguar cars</td>
</tr>
<tr>
<td align="left">Mission</td>
<td align="left">Global</td>
<td align="left">Information</td>
</tr>
<tr>
<td align="left">Challenger learning center</td>
<td align="left">China</td>
<td align="left">Jaguar panthera onca</td>
</tr>
<tr>
<td align="left">January</td>
<td align="left">Outbreak</td>
<td align="left">Atari jaguar</td>
</tr>
<tr>
<td align="left">Crew</td>
<td align="left">Public</td>
<td align="left">Mac jaguar</td>
</tr>
<tr>
<td align="left">Nasa</td>
<td align="left">World</td>
<td align="left">Reviews</td>
</tr>
<tr>
<td align="left">Tragedy</td>
<td align="left">Cdc</td>
<td align="left">Performance</td>
</tr>
<tr>
<td align="left">Challenger accident</td>
<td align="left">Latest</td>
<td align="left">Wildlife</td>
</tr>
<tr>
<td align="left">Information</td>
<td align="left">Sars virus</td>
<td align="left">Virtual</td>
</tr>
<tr>
<td align="left">Science</td>
<td align="left">Asia</td>
<td align="left">Largest cat</td>
</tr>
<tr>
<td align="left">Reagan</td>
<td align="left">April</td>
<td align="left">First</td>
</tr>
<tr>
<td align="left">History</td>
<td align="left">Diseases</td>
<td align="left">Apple</td>
</tr>
<tr>
<td align="left">Description</td>
<td align="left">Sars epidemic</td>
<td align="left">Powerful</td>
</tr>
<tr>
<td align="left">Dodge challenger</td>
<td align="left">Government</td>
<td align="left">Homepage</td>
</tr>
</tbody>
</table>
</table-wrap>
</p>
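<p>The phrase handling described in this section can be sketched roughly as follows. This is one possible reading of the description, not the authors' algorithm: the cap on phrase length (MAX_PHRASE_LEN), the rule that phrases may not begin or end with a stop word, the interpretation of ‘unique’ as ‘occurring in a single result’, and the rule that a sub phrase is dropped only when every result containing it also contains a shared super phrase are all assumptions. Stemming is omitted for brevity.</p>

```python
from collections import defaultdict

MAX_PHRASE_LEN = 3  # assumed cap on phrase length; the paper states none
STOP_WORDS = {"the", "of", "and", "in", "a"}  # hypothetical stop word list


def collect_candidates(snippets):
    """Map each word or phrase (a tuple of words) to the set of results containing it."""
    found = defaultdict(set)
    for index, text in enumerate(snippets):
        for sentence in text.lower().split("."):  # phrases never cross a period
            words = sentence.split()
            for start in range(len(words)):
                longest = min(len(words), start + MAX_PHRASE_LEN)
                for end in range(start + 1, longest + 1):
                    phrase = tuple(words[start:end])
                    # Assumption: skip candidates that begin or end with a stop word.
                    if phrase[0] in STOP_WORDS or phrase[-1] in STOP_WORDS:
                        continue
                    found[phrase].add(index)
    return found


def is_sub_phrase(short, long):
    """True if `short` occurs as a contiguous run inside the strictly longer `long`."""
    if len(short) >= len(long):
        return False
    width = len(short)
    return any(long[i:i + width] == short for i in range(len(long) - width + 1))


def categorize(snippets, n):
    found = collect_candidates(snippets)
    # Drop candidates that occur in one result only (our reading of "unique").
    shared = {p: ids for p, ids in found.items() if len(ids) > 1}
    kept = {}
    for phrase, ids in shared.items():
        covered = set()  # results in which some shared super phrase occurs
        for other, other_ids in shared.items():
            if is_sub_phrase(phrase, other):
                covered |= other_ids
        if ids - covered:  # keep only if it also occurs without a super phrase
            kept[phrase] = ids
    # Most frequent first; ties: higher rank, then longer phrase, then alphabetical.
    ranked = sorted(kept.items(),
                    key=lambda item: (-len(item[1]), min(item[1]), -len(item[0]), item[0]))
    return [(" ".join(phrase), sorted(ids)) for phrase, ids in ranked[:n]]


snippets = [
    "United States history. Maps of the United States",
    "Travel in the United States",
    "History of maps",
]
top = categorize(snippets, 2)
```

<p>In this toy input, ‘united’ and ‘states’ never occur outside ‘united states’, so only the super phrase survives as a category.</p>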
</sec>
<sec>
<label>3.2</label>
<title>Properties of the categorization technique</title>
<p>The calculation of categories is computationally intensive if the number of candidate phrases is large. The dominant factor is the number of results to be categorized. By experimentation we found that the first 150 results seem to capture the most frequent categories. Increasing the number of results beyond that makes practically no difference in the categories. The current implementation needs about 2 seconds to form categories for these 150 search results, making it a feasible solution. Part of the calculation could be made in parallel while waiting for the search results.</p>
<p>As the categories are computed from the first 150 results returned by a search engine, the underlying ranking method has a considerable effect on the outcome. It determines (1) which results are categorized (the first 150), (2) the order of results within the calculated categories (rank order), and (3) which categories are selected in a tie situation (the category with higher ranks in the original result list).</p>
<p>There are a few properties of the frequency-based categorization technique that require further attention. First, the selected categories are not mutually exclusive, meaning that one result can sometimes be accessed through multiple categories. We consider such an overlap in the categories to be a desirable feature, because the meaning of the information depends on the context and the overlapping categorization may help users to realize some of those meanings. For example, let us consider the ‘challenger’ query in
<xref ref-type="table" rid="tbl1">Table 1</xref>
. The following two results seem to discuss roughly the same topic (space shuttle Challenger disaster).</p>
<p>
<list list-type="simple">
<list-item>
<p>
<italic>Online Ethics Center: Roger Boisjoly and the Challenger Disaster</italic>
</p>
</list-item>
<list-item>
<p>The space shuttle Challenger disaster recounted by Roger Boisjoly who attempted to get the mission cancelled.… Roger Boisjoly and the Challenger Disaster.…</p>
</list-item>
</list>
</p>
<p>
<list list-type="simple">
<list-item>
<p>
<italic>The Space Shuttle Challenger Disaster—a NASA Tragedy</italic>
</p>
</list-item>
<list-item>
<p>The Space Shuttle Challenger Disaster, a NASA Tragedy. When the space shuttle… Related Resources to Space Shuttle Challenger Disaster.…</p>
</list-item>
</list>
</p>
<p>The categorization algorithm places both of these results into two categories: ‘challenger disaster’ and ‘space shuttle challenger’. For users looking for information about the accident the former is more relevant while the latter one will attract those generally interested in the space shuttle. The listed results are relevant for both.</p>
<p>The second feature of the technique is that a result is not guaranteed to belong to any category. This could be very undesirable, as some relevant results could not be accessed at all. To overcome this problem, we provide a special built-in category for viewing all the results in the normal rank order list. In our experience, this solution works fine.</p>
<p>The third property of the technique is that it may produce categories that are out of context or hard to understand for the user. Because the technique is based solely on statistical analysis, it does not consider the words as concepts having a well-defined meaning. As a result, the usefulness of the categories is not guaranteed and it can vary between different result sets. We believe that users will understand this and that they are able to discard the possible low-quality categories.</p>
</sec>
<sec>
<label>3.3</label>
<title>User interface</title>
<p>The current prototype is implemented in Java as a standalone application. This approach enabled us to easily experiment with the user interface mechanics and made implementation of the experiment easy and robust. However, we have also implemented the same functionality as a standard web service to be used through a standard web browser. This implementation makes the solution attractive to a wider user population and is seen to be feasible.</p>
<p>The user interface follows the basic idea used in most graphical email clients and in Windows Explorer. These programs display a set of collections on the left of the window while the right side shows the contents of the selected collection (like files in a folder). The same holds true here as well: the left side of the user interface shows the list of categories (words and phrases) and the right side displays the corresponding results (
<xref ref-type="fig" rid="fig5">Fig. 5</xref>
). This familiar design was assumed to be easy to understand and adopt.</p>
<p>The category list and the result display are tightly coupled in the user interface so that changing the category selection immediately changes the contents of the result view. In the result view, the selected keyword is highlighted in pale yellow to make the connection between a selected keyword and a result evident.
<fig id="fig5" position="float">
<label>Fig. 5</label>
<caption>
<p>The proposed category user interface. The category list is on the left and the corresponding results are shown on the right.</p>
</caption>
<graphic xlink:href="iwc17-0187-f5.tif"></graphic>
</fig>
</p>
<p>If no category is selected, the result view is empty. However, this is not likely to happen as the special built-in ‘All results’ category is automatically selected after the query has been completed. As the name of the built-in category suggests, by selecting that category the user will see all the results retrieved for the query (by default, the first 150 results). It is also possible to select several categories at once. In this case, the results are required to belong to all the selected categories (intersection of the categories).</p>
<p>The order of the results is determined by the search engine. When all the results are shown, the result listing is the same as that returned by the search engine. When a category is selected, the relative order of the results is determined by the search engine, although the results are not likely to be sequential in the original list. The results have an ordinal that shows the position of the result in the rank order of the search engine.</p>
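<p>The selection behavior described above (empty view without a selection, the built-in ‘All results’ category, intersection of multiple selections, and search engine rank order within the filtered list) can be illustrated with a short sketch. The function name, the data layout (category name mapped to rank indices), and the sample categories are hypothetical and are not taken from the prototype.</p>

```python
ALL_RESULTS = "All results"  # name of the built-in category


def filter_results(categories, selected, total_results):
    """Return the rank indices of the results that belong to every selected category."""
    if not selected:
        return []  # no category selected: the result view stays empty
    remaining = set(range(total_results))
    for name in selected:
        if name != ALL_RESULTS:  # the built-in category does not restrict anything
            remaining = remaining.intersection(categories[name])
    # Sorting by rank index keeps the relative order given by the search engine.
    return sorted(remaining)


# Hypothetical categories: name mapped to the ranks of the results containing it.
categories = {"jaguar cars": [0, 3, 4], "reviews": [2, 3, 4, 7], "club": [0, 5]}
both = filter_results(categories, ["jaguar cars", "reviews"], 10)
everything = filter_results(categories, [ALL_RESULTS], 10)
```

<p>Here the combined selection yields the results at ranks 3 and 4, which are not adjacent to the other members of either category in the original list but keep their relative order.</p>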
<p>Other parts of the user interface are fairly obvious. On the top, there is a field for entering the query and ‘Search’ and ‘Cancel’ buttons for controlling the search engine. The status bar on the bottom shows additional information like the number of documents found in total and the state of the system (on/off-line).</p>
</sec>
</sec>
<sec>
<label>4</label>
<title>Experiment</title>
<p>We conducted an experiment to evaluate the categorization algorithm and the user interface described in the previous sections. The experiment tested if the new solution differs from a widely accepted solution, in this case the search engine user interface displaying results in a ranked list, in terms of speed, accuracy, and subjective satisfaction.</p>
<sec>
<label>4.1</label>
<title>Participants</title>
<p>There were 20 volunteer participants (8 male, 12 female) in the study. The participants’ average age was 35 years varying from 19 to 57 years. The participants were recruited from the local university. They were students and personnel from seven organizational units. The participants had relatively long histories of general computer use (on average 11.5 years) as well as web use (on average 6 years). Almost all the participants can be regarded as experienced computer users.</p>
</sec>
<sec>
<label>4.2</label>
<title>Apparatus</title>
<p>Two user interfaces were used to access the search results:
<list list-type="order">
<list-item>
<p>
<italic>Category interface</italic>
(category UI,
<xref ref-type="fig" rid="fig6">Fig. 6</xref>
, right window) was the Findex user interface described above with two modifications: (1) selecting multiple keywords was not allowed, and (2) automatic selection of the ‘All results’ category was disabled. The latter modification means that the result list was initially empty in each task. Both modifications were made after pilot tests to make the experiment set-up more robust and focused. Category computation produced 15 categories and it was based on the first 150 results.
<fig id="fig6" position="float">
<label>Fig. 6</label>
<caption>
<p>Desktop set-up in the test. Task window on the left and search window on the right. The screen size was 1200×1024 pixels and the search window size was 800×900 pixels.</p>
</caption>
<graphic xlink:href="iwc17-0187-f6.tif"></graphic>
</fig>
<fig id="fig7" position="float">
<label>Fig. 7</label>
<caption>
<p>Reference user interface showing 10 results on a page and radio buttons to navigate between the pages. Note the checkboxes for selecting results.</p>
</caption>
<graphic xlink:href="iwc17-0187-f7.tif"></graphic>
</fig>
</p>
</list-item>
<list-item>
<p>
<italic>Reference interface</italic>
(reference UI,
<xref ref-type="fig" rid="fig7">Fig. 7</xref>
) was a Google web search engine imitation showing results in a ranked list on separate pages, 10 results per page. At the bottom of the window, there were controls to browse the pages in order (Previous and Next buttons) or in random order (radio buttons). The participant had access to 15 pages (the first 150 results).</p>
</list-item>
</list>
</p>
<p>Both user interfaces showed the results in the same visual format (
<xref ref-type="fig" rid="fig8">Fig. 8</xref>
). The format resembles closely the format of Google, omitting size, category, cached pages, and similar pages features found in Google.
<fig id="fig8" position="float">
<label>Fig. 8</label>
<caption>
<p>Visual format of the individual result elements in the experiment.</p>
</caption>
<graphic xlink:href="iwc17-0187-f8.tif"></graphic>
</fig>
</p>
<p>The reason why we did not use the publicly available Google interface was the controllability of the experiment variables. By using Google, we would have been faced with possible network problems and changing content of the Google database. The instrumentation of different systems could also have caused errors in timings, for instance.</p>
<p>The experiment procedure was automated. During the experiment the computer desktop contained two windows: one displayed the test tasks in textual format (
<italic>task window</italic>
) while the other was the user interface studied (
<italic>search window</italic>
) as shown in
<xref ref-type="fig" rid="fig6">Fig. 6</xref>
. The size and location of both windows were predetermined and fixed.</p>
</sec>
<sec>
<label>4.3</label>
<title>Design</title>
<p>The experiment had
<italic>search user interface</italic>
as the only independent variable with two values: category UI and reference UI. The values of the independent variable were varied within the subjects and thus the analysis was done using repeated measures tools.</p>
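<p>With a single within-subject factor that has only two levels, the F statistic of a repeated measures ANOVA equals the square of the paired t statistic, with 1 and N−1 degrees of freedom. The sketch below illustrates this relationship; the function name and the four-participant data are invented for illustration and are not data from the experiment.</p>

```python
from math import sqrt
from statistics import mean, stdev


def paired_f(condition_a, condition_b):
    """F statistic and degrees of freedom for a two-level within-subject comparison."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    # Paired t statistic: mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t * t, (1, n - 1)


# Hypothetical per-participant scores (e.g. results found per minute).
category_ui = [6.0, 5.0, 7.0, 4.0]
reference_ui = [4.0, 4.0, 5.0, 3.0]
f_value, dof = paired_f(category_ui, reference_ui)
```

<p>With 20 participants, as in this experiment, the same computation gives the degrees of freedom (1, 19).</p>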
<p>As dependent variables we measured: (1) time to accomplish a task in seconds, (2) number of results selected for a task, (3) relevance of each selected result on a three-step scale (relevant, related, not relevant), and (4) subjective attitudes towards the systems.</p>
</sec>
<sec>
<label>4.4</label>
<title>Procedure</title>
<p>The experiments were carried out in a usability laboratory where participants were invited one at a time. Before the experiment the whole procedure was explained to the participants and any questions regarding the set-up were answered. One experiment lasted about 45 min and contained 18 information seeking tasks and three claim rating tasks (one for each condition and one comparing the two).</p>
<p>The experiment was divided into two parts, each consisting of nine information seeking tasks and a claim answering task. One part was carried out with the category interface and the other with the reference interface. The order of the parts and of the tasks was counterbalanced between the participants. Before each part, it was explained to the participants how the user interface worked, and they were allowed to try it by themselves.</p>
<p>The search tasks were based on predefined queries. We adopted this approach from an experiment by
<xref ref-type="bibr" rid="bib8">Dumais et al. (2001)</xref>
where the participants did not formulate the queries themselves either. This approach makes perfect sense, because it removes the vast variability caused by different search capabilities of the participants and lets us measure the performance in the result evaluation phase, which we aim to improve. The tasks were selected to cover multiple interests (e.g. astronomy, cooking, movies, cars, gardening, etc.). The queries were balanced in terms of (1) number of obviously relevant categories, and (2) the position of them in the category list. In addition, a few queries did not have any obviously relevant categories.</p>
<p>For each task, the participants were instructed to first read the task description, then push the ‘Start’ button on the task window and promptly proceed to accomplish the task using the search window. While the participant read the task, the test apparatus fetched the results of the corresponding predefined query, but the user interface was hidden. When the participant pushed the ‘Start’ button, the search window was enabled and the task execution could start immediately. The actual queries were executed before the experiment and saved locally for fast and equal access.</p>
<p>Upon task completion, the participants were instructed to push the ‘Done’ button on the task window. The time between ‘Start’ and ‘Done’ button presses was measured as the total time for the task. The participants were told about this timing scheme. After each task, the participant used the ‘Next’ button in the task window to see the next one. Between the tasks, the participants were in control of the situation being able to take a short break if desired.</p>
<p>The actual task of the participant was to ‘collect as many relevant results for the information seeking task as possible as fast as you can’. The task has two competing goals (speed and accuracy) to simulate realistic settings. In a real situation, users balance between these two goals intuitively based on various factors (time, importance of the task, available resources, etc.), but in a test situation such limiting factors have to be created artificially. In the pilot tests we observed that even though the task had these two competing goals, users tended to favor thoroughness and spent an extended amount of time on a task. To enforce more realistic (faster) behavior, the time for each task was limited to 1 min. The participants were encouraged to utilize their own personal habits in selecting the results.</p>
<p>When the 1 min time limit passed, the search window was automatically disabled and the clock was automatically stopped. The participants were able to proceed to the next task before this time limit if they thought that they had found enough results.</p>
<p>Collecting the results was implemented by adding a check box beside each result item (see
<xref ref-type="fig" rid="fig6 fig7">Figs. 6 and 7</xref>
). Participants were instructed to check the corresponding check box when a result seemed relevant. If a mistake happened, it was possible to clear the mark by clicking the checkbox again.</p>
<p>After the two sets of tasks and ratings, comparison ratings and demographic information were elicited by online forms in the task window. After this the experiment was over.</p>
</sec>
</sec>
<sec>
<label>5</label>
<title>Results</title>
<sec>
<label>5.1</label>
<title>Speed measures</title>
<p>As the total time reserved for each task was limited to 1 min, the plain task times do not tell the whole story about the speed. In both conditions (category and reference user interface), it was very common that the participant used all the time reserved for a task. Thus, the mean task times were very close to 1 min, being 56.6 s (sd=5.5) for the category interface and 58.3 s (sd=3.5) for the reference interface. Due to the experiment set-up, it is understandable that we did not observe a statistically significant difference in task time between the two conditions. Repeated measures analysis of variance (ANOVA) gave
<italic>F</italic>
(1,19)=3.65, n.s.
<fig id="fig9" position="float">
<label>Fig. 9</label>
<caption>
<p>There is a statistically significant difference in speed of use between the compared user interfaces in favor of the category UI. Note also the greater proportion of relevant results found with it.</p>
</caption>
<graphic xlink:href="iwc17-0187-f9.tif"></graphic>
</fig>
</p>
<p>The number of results that the participants collected revealed an interesting finding as
<xref ref-type="fig" rid="fig9">Fig. 9</xref>
shows. On average the participants were able to find 5.5 results per minute (sd=2.0) using the category interface contrasted with 4.1 per minute (sd=1.1) found with the reference interface. This difference in speed is statistically significant (
<italic>F</italic>
(1,19)=12.13,
<italic>p</italic>
&lt;.01).</p>
<p>However, the raw speed of result acquisition may not be the best measure to evaluate the efficiency of a search engine user interface. It may be that the selected results do not really answer the user’s initial information need. In order to estimate the usefulness of the obtained results, we need to consider their accuracy.</p>
</sec>
<sec>
<label>5.2</label>
<title>Accuracy measures</title>
<p>For measuring the quality of the results the participants collected in the experiment, we judged the relevancy of the selected results for each task. Each selected result was assigned one of three relevance values:
<italic>relevant</italic>
,
<italic>related</italic>
, and
<italic>not relevant</italic>
. Rating was based only on the snippets, as were the participants’ selections, because we did not want to include the snippet–document relationship as a factor in the study.</p>
<p>A result was rated relevant if the snippet clearly indicated that the page referred to had the desired content. In practice, for a result to be relevant we required the existence of multiple concepts of the task in the snippet. In contrast, a related snippet was required to refer to the same overall topic, but not to other aspects of the task description. Finally, the not relevant snippets differed from the overall topic of the task. For example, in a task to find pictures of the planet Jupiter, a snippet referring to images of Jupiter was rated as relevant, a snippet referring generally to the planet Jupiter was rated as related, and finally a snippet referring to the Jupiter Research Company was rated as not relevant.</p>
<p>Relevance was measured in four ways. Firstly,
<italic>recall</italic>
states which portion of all relevant results were found. In this study the recall measure was calculated from all the relevant results found by all the participants, not from all the relevant results in the result sets, which is the conventional way. Secondly,
<italic>precision</italic>
tells the proportion of relevant results among selected results. The third and fourth relevance measures concern the accuracy in relation to speed.</p>
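<p>The two measures defined above can be written out as a short sketch. Note that recall is computed against the pool of relevant results found by any participant, as stated in the text, rather than against all relevant results in the result set. The function names, participant labels, result ids, and judgments below are hypothetical.</p>

```python
def precision(selected, relevant):
    """Proportion of the selected results that are relevant."""
    if not selected:
        return 0.0
    return len(selected.intersection(relevant)) / len(selected)


def pooled_recall(selected, pool):
    """Share of the pooled relevant results that one participant found."""
    if not pool:
        return 0.0
    return len(selected.intersection(pool)) / len(pool)


# Hypothetical data for one task: result ids selected by three participants.
selections = {"p1": {1, 2, 5, 9}, "p2": {2, 3}, "p3": {5, 7, 8}}
# Hypothetical judgments; only results selected by someone are judged.
judgments = {1: "relevant", 2: "relevant", 3: "relevant", 5: "relevant",
             7: "relevant", 8: "related", 9: "not relevant"}

# Pool: every result judged relevant, i.e. found by at least one participant.
pool = {rid for rid, label in judgments.items() if label == "relevant"}
p1_precision = precision(selections["p1"], pool)   # 3 of 4 selections are relevant
p1_recall = pooled_recall(selections["p1"], pool)  # 3 of 5 pooled relevant results
```

<p>Because the pool contains only results that someone actually selected, this recall is an upper estimate of the conventional measure.</p>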
<p>Recall and precision measures show a difference between the user interfaces. When using the category interface 62% of the results (sd=13), on average, were relevant whereas the precision with the reference user interface was 49% (sd=15). The difference is statistically significant:
<italic>F</italic>
(1,19)=14.49,
<italic>p</italic>
&lt;.01. The recall measure revealed a similar difference: with the category user interface the participants found on average 33% (sd=4) and with the reference user interface 19% (sd=7) of the relevant results for each task (
<italic>F</italic>
(1,19)=29.88,
<italic>p</italic>
&lt;.01).</p>
<p>The breakdown of the speed measures according to relevance shows also significant differences (see
<xref ref-type="fig" rid="fig9">Fig. 9</xref>
). Using the category user interface, the participants were able to find 3.5 relevant results per minute on average (sd=1.5) whereas the use of the reference user interface yielded 2.0 relevant results per minute (sd=0.8;
<italic>F</italic>
(1,19)=8.20,
<italic>p</italic>
&lt;.01). The speed of acquiring related results (on average 1.4 per minute) is the same for both systems, but the category user interface reduces the number of not relevant results (0.4 vs. 0.7 not relevant results per minute, sd=0.3 for both cases;
<italic>F</italic>
(1,19)=11.24,
<italic>p</italic>
<.01).</p>
</sec>
<sec>
<label>5.3</label>
<title>Immediate success measures</title>
<p>In the real world, web users are not typically looking for as many results as possible, but in many cases they are looking for the first good enough answer. To measure this kind of behavior, we analyzed the success of the first few selections.
<xref ref-type="fig" rid="fig10">Fig. 10</xref>
shows the cumulative percentage of the cases where users have found at least one relevant result with the
<italic>n</italic>
th selection. This measure is called immediate accuracy (
<xref ref-type="bibr" rid="bib20">Käki, 2004</xref>
). The most interesting difference appears already at the first selection: with the category UI, users find a relevant result in 56% of the cases, whereas the figure for the reference UI is 40%. The difference is statistically significant,
<italic>F</italic>
(1,19)=12.5,
<italic>p</italic>
<.01. Note that the difference stays virtually the same after the first selection.
<fig id="fig10" position="float">
<label>Fig. 10</label>
<caption>
<p>Cumulative proportion of cases where at least one relevant result has been found with the
<italic>n</italic>
th selection. The participants found the first relevant result sooner with the category UI than with the reference UI.</p>
</caption>
<graphic xlink:href="iwc17-0187-f10.tif"></graphic>
</fig>
<fig id="fig11" position="float">
<label>Fig. 11</label>
<caption>
<p>The precision (proportion of relevant results) of the first selections was higher when using the category user interface.</p>
</caption>
<graphic xlink:href="iwc17-0187-f11.tif"></graphic>
</fig>
</p>
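<p>The immediate accuracy measure described above can be sketched as follows. This is only an illustration under stated assumptions: each case (one task performed by one participant) is represented as a hypothetical list of booleans telling, in selection order, whether each selected result was relevant; the function name and the data are not from the study.</p>
<preformat>
```python
def immediate_accuracy(cases, max_n):
    """For n = 1..max_n, share of cases with a relevant result among the first n selections."""
    if not cases:
        return [0.0] * max_n
    curve = []
    for n in range(1, max_n + 1):
        hits = sum(1 for case in cases if any(case[:n]))
        curve.append(hits / len(cases))
    return curve


# Hypothetical data: True marks a relevant selection, in the order the selections were made.
cases = [
    [True, False, True],
    [False, True],
    [False, False, False],
    [True],
]
print(immediate_accuracy(cases, 3))
```
</preformat>
<p>For the four toy cases the curve is 0.5, 0.75, 0.75: two cases succeed on the first selection, a third on the second, and one never reaches a relevant result, so the cumulative proportion stays below 1.</p>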
<p>The same effect can be seen in the precision of the first selections in
<xref ref-type="fig" rid="fig11">Fig. 11</xref>
. With the category UI, 59% of the first selections and 70% of the second selections are relevant. The corresponding numbers for the reference UI are 42% and 46%. In both cases the difference is significant; ANOVA gives
<italic>F</italic>
(1,19)=13.1,
<italic>p</italic>
<.01 and
<italic>F</italic>
(1,19)=25.6,
<italic>p</italic>
<.001, respectively.</p>
<p>The comparison of the corresponding times, however, does not show notable differences. The average time the participants needed to find the first relevant result was about 21 s with both user interfaces. The second relevant result was found in 27 and 35 s (sd=6 and 8), while finding the third relevant result took 30 and 36 s for the category and reference user interfaces, respectively. The difference in acquiring the second relevant result is significant (
<italic>F</italic>
(1,19)=17.70,
<italic>p</italic>
<.01). It is notable, however, that when using the reference UI there were more cases where not a single relevant result was found for a task. Thus the average times for finding the first relevant result are not completely comparable.</p>
</sec>
<sec>
<label>5.4</label>
<title>Subjective measures</title>
<p>The alternative user interfaces were also evaluated using subjective measures. To this end, the participants were presented with a set of claims (e.g. ‘It was easy to find the results’ and ‘The user interface was confusing’) after using each user interface. The participants responded to the claims on a six-point scale from agree (0) to disagree (5). At the end of the experiment, there was another set of claims in which the participants had to compare the two user interfaces against each other (e.g. ‘Functionality was easier to understand’ and ‘Task execution was harder’). These responses were also collected on a six-point scale, but the range was from category interface (0) to reference interface (5).
<fig id="fig12" position="float">
<label>Fig. 12</label>
<caption>
<p>Distribution of the subjective ratings shows more positive attitudes towards the category UI than towards the reference UI.</p>
</caption>
<graphic xlink:href="iwc17-0187-f12.tif"></graphic>
</fig>
</p>
<p>
<xref ref-type="fig" rid="fig12">Fig. 12</xref>
shows the responses to the claims. In the figure, the scales have been normalized to have positive answers on the left and negative ones on the right. As
<xref ref-type="fig" rid="fig12">Fig. 12</xref>
suggests, there is a difference in the subjective ratings of the systems, and the analysis of the responses shows that it is statistically significant. The median score for the category interface (median=1) indicates more positive attitudes towards it than towards the reference interface (median=2), and the Wilcoxon matched-pairs signed-ranks test gives
<italic>Z</italic>
=−2.51,
<italic>p</italic>
<.02 (see
<xref ref-type="fig" rid="fig12">Fig. 12</xref>
for variability). Similarly, when the two systems are compared directly, we see a statistically significant preference for the category interface. On a six-point scale where 0 stands for the reference UI and 5 for the category UI, the median score was 3.5. The one-sample Wilcoxon signed-ranks test gives
<italic>V</italic>
=188,
<italic>p</italic>
<.01.</p>
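<p>To make the one-sample statistic more concrete, the sketch below computes a signed-rank sum in plain Python. It rests on two assumptions that the text does not state: that V denotes the sum of the ranks of the positive differences, and that the comparison ratings were tested against the scale midpoint of 2.5. The function name and the ratings are hypothetical, and no p-value is derived here.</p>
<preformat>
```python
def signed_rank_v(values, midpoint):
    """Sum of the ranks of positive differences from midpoint (ties share average ranks)."""
    # Zero differences are dropped, as is usual for this test.
    diffs = [v - midpoint for v in values if v != midpoint]
    ordered = sorted(abs(d) for d in diffs)
    rank_of = {}
    i = 0
    while i != len(ordered):
        j = i
        # Extend j to the last position holding the same absolute difference.
        while j + 1 != len(ordered) and ordered[j + 1] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j + 1) / 2  # average of 1-based ranks i+1..j+1
        i = j + 1
    return sum(rank_of[abs(d)] for d in diffs if d > 0)


# Hypothetical comparison ratings on the 0..5 scale from five participants.
ratings = [4, 3, 5, 2, 4]
print(signed_rank_v(ratings, 2.5))
```
</preformat>
<p>For these five toy ratings the sketch prints 13.5 out of a possible 15, since only the rating of 2 falls below the assumed midpoint.</p>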
</sec>
</sec>
<sec>
<label>6</label>
<title>Discussion and conclusions</title>
<p>In the beginning, we had two questions in mind. The first was how to present search results. Our answer was Findex, a system with automatically formed categories that provide an overview of the results in association with a filtering user interface. The second question was whether the proposed solution would work in practice. The range of measures collected in the experiment gives us good reason to believe that Findex does, indeed, perform better than a conventional ranked-list user interface with 10 results per page. This belief is supported by four findings from the experiment.</p>
<p>First, with the category user interface it is possible to browse through more results than with the conventional user interface because the searching speed is higher (5.5 vs. 4.1 results per minute). This is important because web search results are often rather unreliable, and it is thus desirable to be able to access alternative results quickly.</p>
<p>Second, and more importantly, the proposed interface not only gives the users more options but gives them more
<italic>relevant</italic>
options. The results showed that the increase in the number of results was due to an increased number of relevant results. The measured speed of finding relevant results was about 40% higher with the proposed system than with the reference solution.</p>
<p>Third, for users in real situations, the immediate success of the search is also very important. The results show that when using our system, the users found the first relevant result earlier (with fewer selections) than with the reference user interface. The result is very important although we did not measure the time difference in finding the first relevant result. The selection with which the first relevant result is found is crucial, since people tend to visit very few result pages in the web environment (
<xref ref-type="bibr" rid="bib16">Spink et al., 2001</xref>
), typically one or two. According to the results, the proposed system performs better in this kind of search tactic.</p>
<p>Fourth, the results showed that users had positive attitudes towards the system. Although subjective ratings are rather soft measures, they do grasp a very critical issue in a user interface. Even the most sophisticated and efficient user interface is useless if people do not like it. Based on the questionnaire and informal discussions with the participants we have good reasons to believe that the proposed interface could find its audience.</p>
<p>Although the results are very promising, we must bear in mind that the improvements come at a price. In order to compute the categories, a large number of results must be fetched from a search engine. This takes time in addition to the actual computation of the categories. However, the system is built so that it can display the first 10 results immediately and then compute the categories in the background while the user evaluates the first results. This reduces the perceived cost of the operations, but delays can still have an effect on subjective ratings.</p>
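<p>The arrangement of showing the first results at once and computing the categories in the background might be sketched as below. This is not the Findex implementation: the function names are hypothetical, the snippets are assumed to be already fetched, and the category step is reduced to counting in how many snippets each whitespace-separated word occurs, which only approximates the frequent-word idea.</p>
<preformat>
```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def compute_categories(snippets, top_k):
    """Return the top_k words ranked by the number of snippets containing them."""
    counts = Counter()
    for snippet in snippets:
        # Count each word once per snippet; sorting keeps tie order stable.
        counts.update(sorted(set(snippet.lower().split())))
    return [word for word, _ in counts.most_common(top_k)]


def search(snippets, page_size=10, top_k=15):
    """Return the first page at once and a future that will hold the categories."""
    executor = ThreadPoolExecutor(max_workers=1)
    categories = executor.submit(compute_categories, snippets, top_k)
    executor.shutdown(wait=False)  # do not block; the submitted job still runs
    return snippets[:page_size], categories


snippets = [
    "jupiter images",
    "jupiter planet images",
    "jupiter research company",
]
first_page, categories = search(snippets, page_size=2, top_k=2)
print(first_page)           # available to the user right away
print(categories.result())  # blocks only if the categories are not ready yet
```
</preformat>
<p>Returning a future keeps the caller free to render the first page before asking for the categories, which is the point of the design: the waiting time overlaps with the time the user spends reading the first results.</p>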
</sec>
<sec>
<label>7</label>
<title>Future work</title>
<p>The basic functionality of the category user interface is promising, and we plan to continue investigating it. First, the number of categories presented to the user is an interesting question that was left completely untouched in the present study. Even the number of categories used (15) was chosen somewhat arbitrarily. We plan to study the effect of the number of categories on the users’ performance.</p>
<p>Second, we are conducting a longitudinal study of the usability of the category user interface. In the study we aim to find out whether the categories are beneficial in the long run when their use is truly voluntary and happens in natural settings. In addition, it is interesting to see whether the category user interface alters the users’ behavior in some ways; for example, could it stimulate the users to reformulate their queries better? For the study, the user interface has been implemented for the web environment.</p>
</sec>
</body>
<back>
<ack>
<title>Acknowledgements</title>
<p>This work was supported by the Graduate School in User-Centered Information Technology (UCIT). We would like to thank Poika Isokoski, Johanna Höysniemi and Kari-Jouko Räihä for comments and discussions that contributed to this study, Tomi Heimonen, Natalie Jhaveri, Harri Siirtola.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="bib1">
<nlm-citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Chekuri</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Goldwasser</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Raghavan</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Upfal</surname>
<given-names>E.</given-names>
</name>
</person-group>
<comment>1997. Web search using automatic classification, Proceedings of the Sixth International World Wide Web Conference WWW6, Santa Clara, USA.</comment>
</nlm-citation>
</ref>
<ref id="bib2">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Dumais</surname>
<given-names>S.</given-names>
</name>
</person-group>
<article-title>Bringing order to the web: automatically categorizing search results</article-title>
<source>Proceedings of CHI’2000, The Hague, Netherlands</source>
<year>2000</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>ACM Press</publisher-name>
<comment>pp. 145–152</comment>
</nlm-citation>
</ref>
<ref id="bib3">
<nlm-citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Hearst</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Hong</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>J.</given-names>
</name>
</person-group>
<comment>1999. Cha–Cha: a system for organizing Intranet search results, Proceedings of the Second USENIX Symposium on Internet Technologies and Systems (USITS).</comment>
</nlm-citation>
</ref>
<ref id="bib4">
<nlm-citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Cho</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Myaeng</surname>
<given-names>S.</given-names>
</name>
</person-group>
<comment>2000. Visualization of retrieval results using DART, Proceedings of RIAO 2000, Paris, France.</comment>
</nlm-citation>
</ref>
<ref id="bib5">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Cutting</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Karger</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Pedersen</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Tukey</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Scatter/Gather: a cluster-based approach to browsing large document collections</article-title>
<source>Proceedings of SIGIR 1992, Copenhagen, Denmark</source>
<year>1992</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>ACM Press</publisher-name>
<comment>pp. 318–329</comment>
</nlm-citation>
</ref>
<ref id="bib6">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Dumais</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>H.</given-names>
</name>
</person-group>
<article-title>Hierarchical classification of web content</article-title>
<source>Proceedings of SIGIR 2000, Athens, Greece</source>
<year>2001</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>ACM Press</publisher-name>
<comment>pp. 256–263</comment>
</nlm-citation>
</ref>
<ref id="bib7">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Dumais</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Furnas</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Landauer</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Deerwester</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Harshman</surname>
<given-names>R.</given-names>
</name>
</person-group>
<article-title>Using latent semantic analysis to improve access to textual information</article-title>
<source>Proceedings of CHI’88, Washington DC, USA</source>
<year>1988</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>ACM Press</publisher-name>
<comment>pp. 281–285</comment>
</nlm-citation>
</ref>
<ref id="bib8">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Dumais</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Cutrell</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>H.</given-names>
</name>
</person-group>
<article-title>Optimizing search by showing results in context</article-title>
<source>Proceedings of CHI’2001, Seattle, USA</source>
<year>2001</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>ACM Press</publisher-name>
<comment>pp. 277–284</comment>
</nlm-citation>
</ref>
<ref id="bib9">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hearst</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Pedersen</surname>
<given-names>J.</given-names>
</name>
</person-group>
<article-title>Reexamining the cluster hypothesis: Scatter/Gather on retrieval results</article-title>
<source>Proceedings of ACM SIGIR’96, Zürich, Switzerland</source>
<year>1996</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>ACM Press</publisher-name>
</nlm-citation>
</ref>
<ref id="bib10">
<nlm-citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Jansen</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Spink</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Bateman</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Saracevic</surname>
<given-names>T.</given-names>
</name>
</person-group>
<comment>1998. Searchers, the subjects they search, and sufficiency: a study of a large sample of Excite searchers, 1998 World Conference on the WWW and Internet, Orlando, USA 1998.</comment>
</nlm-citation>
</ref>
<ref id="bib11">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Jones</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Paynter</surname>
<given-names>G.</given-names>
</name>
</person-group>
<article-title>Topic-based browsing within a digital library using keyphrases</article-title>
<source>Proceedings of the ACM Conference on Digital Libraries, Berkeley, USA</source>
<year>1999</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>ACM Press</publisher-name>
<comment>pp. 114–121</comment>
</nlm-citation>
</ref>
<ref id="bib12">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jones</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Deo</surname>
<given-names>S.</given-names>
</name>
</person-group>
<article-title>Using keyphrases as search result surrogates on small screen devices</article-title>
<source>Personal and Ubiquitous Computing</source>
<year>2004</year>
<volume>8</volume>
<issue>1</issue>
<fpage>55</fpage>
<lpage>68</lpage>
</nlm-citation>
</ref>
<ref id="bib20">
<nlm-citation citation-type="other">
<comment>Käki, M., 2004. Proportional Search Interface usability measures. Proceedings of NordiCHI 2004, Tampere, Finland. ACM Press, New York, pp. 365–372.</comment>
</nlm-citation>
</ref>
<ref id="bib13">
<nlm-citation citation-type="other">
<comment>Maarek, Y., Fagin, R., Ben-Shaul, I., Pelleg, D. (2000). Ephemeral document clustering for web applications. IBM Research Report RJ. 10186, April 2000.</comment>
</nlm-citation>
</ref>
<ref id="bib14">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Pirolli</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Schank</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Hearst</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Diehl</surname>
<given-names>C.</given-names>
</name>
</person-group>
<article-title>Scatter/Gather browsing communicates the topic structure of a very large text collection</article-title>
<source>Proceedings of CHI’96, Vancouver, Canada</source>
<year>1996</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>ACM Press</publisher-name>
<comment>pp. 213–220</comment>
</nlm-citation>
</ref>
<ref id="bib15">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pratt</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Fagan</surname>
<given-names>L.</given-names>
</name>
</person-group>
<article-title>The usefulness of dynamically categorizing search results</article-title>
<source>Journal of the American Medical Informatics Association</source>
<year>2000</year>
<volume>7</volume>
<issue>6</issue>
<fpage>605</fpage>
<lpage>617</lpage>
</nlm-citation>
</ref>
<ref id="bib16">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Spink</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Wolfram</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Jansen</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Saracevic</surname>
<given-names>T.</given-names>
</name>
</person-group>
<article-title>Searching the web: the public and their queries</article-title>
<source>Journal of the American Society for Information Science and Technology</source>
<year>2001</year>
<volume>52</volume>
<issue>3</issue>
<fpage>226</fpage>
<lpage>234</lpage>
</nlm-citation>
</ref>
<ref id="bib17">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Wittenburg</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Sigman</surname>
<given-names>E.</given-names>
</name>
</person-group>
<source>Integration of browsing, searching, and filtering in an applet for web information access</source>
<year>1997</year>
<publisher-loc>Atlanta, USA</publisher-loc>
<publisher-name>CHI’97 Electronic Publications: Late Breaking/Short Talks</publisher-name>
</nlm-citation>
</ref>
<ref id="bib18">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Zamir</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Etzioni</surname>
<given-names>O.</given-names>
</name>
</person-group>
<article-title>Web document clustering: a feasibility demonstration</article-title>
<source>Proceedings of the 19th International SIGIR Conference on Research and Development in Information Retrieval (SIGIR’98)</source>
<year>1998</year>
<publisher-loc>New York</publisher-loc>
<publisher-name>ACM Press</publisher-name>
<comment>pp. 46–54</comment>
</nlm-citation>
</ref>
<ref id="bib19">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Zamir</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Etzioni</surname>
<given-names>O.</given-names>
</name>
</person-group>
<article-title>Grouper: a dynamic clustering interface to web search results</article-title>
<source>Proceedings of the Eighth International World Wide Web Conference WWW8, Toronto, Canada</source>
<year>1999</year>
<publisher-loc>Amsterdam</publisher-loc>
<publisher-name>Elsevier</publisher-name>
</nlm-citation>
</ref>
</ref-list>
<fn-group>
<fn id="fn1">
<label>1</label>
<p>Tel. +358 3 215 8871; fax: +358 3 215 6070.</p>
</fn>
</fn-group>
</back>
</article>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo>
<title>Findex: improving search result use through automatic filtering categories</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA">
<title>Findex: improving search result use through automatic filtering categories</title>
</titleInfo>
<name type="personal">
<namePart type="given">Mika</namePart>
<namePart type="family">Käki</namePart>
<affiliation>Department of Computer Sciences, University of Tampere, Kehruukoulunkatu 1, FIN-33014 Tampere, Finland</affiliation>
<affiliation>E-mail: mika.kaki@cs.uta.fi</affiliation>
<affiliation>Corresponding author. Tel.: +358 3 215 6181; fax: +358 3 215 6070. E-mail addresses: mika.kaki@cs.uta.fi (M. Käki), anne.aula@cs.uta.fi (A. Aula).</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Anne</namePart>
<namePart type="family">Aula</namePart>
<affiliation>Department of Computer Sciences, University of Tampere, Kehruukoulunkatu 1, FIN-33014 Tampere, Finland</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="research-article" displayLabel="research-article" authority="ISTEX" authorityURI="https://content-type.data.istex.fr" valueURI="https://content-type.data.istex.fr/ark:/67375/XTP-1JC4F85T-7">research-article</genre>
<originInfo>
<publisher>Oxford University Press</publisher>
<place>
<placeTerm type="text">Oxford, UK</placeTerm>
</place>
<dateIssued encoding="w3cdtf">2005-03</dateIssued>
<dateCreated encoding="w3cdtf">2005-01-10</dateCreated>
<copyrightDate encoding="w3cdtf">2005</copyrightDate>
</originInfo>
<abstract>Long result lists from web search engines can be tedious to use. We designed a text categorization algorithm and a filtering user interface to address the problem. The Findex system provides an overview of the results by presenting a list of the most frequent words and phrases as result categories next to the actual results. Selecting a category (word or phrase) filters the result list to show only the results containing it. An experiment with 20 participants was conducted to compare the category design to the de facto standard solution (Google-type ranked list interface). Results show that the users were 25% faster and 21% more accurate with our system. In particular, participants’ speed of finding relevant results was 40% higher with the proposed system. Subjective ratings revealed significantly more positive attitudes towards the new system. Results indicate that the proposed design is feasible and beneficial.</abstract>
<subject>
<genre>Keywords</genre>
<topic>Web search</topic>
<topic>Search user interface</topic>
<topic>Categorization</topic>
<topic>Clustering</topic>
<topic>Information access</topic>
</subject>
<relatedItem type="host">
<titleInfo>
<title>Interacting with Computers</title>
</titleInfo>
<genre type="journal" authority="ISTEX" authorityURI="https://publication-type.data.istex.fr" valueURI="https://publication-type.data.istex.fr/ark:/67375/JMC-0GLKJH51-B">journal</genre>
<identifier type="ISSN">0953-5438</identifier>
<identifier type="eISSN">1873-7951</identifier>
<identifier type="PublisherID">iwc</identifier>
<identifier type="PublisherID-hwp">iwc</identifier>
<part>
<date>2005</date>
<detail type="volume">
<caption>vol.</caption>
<number>17</number>
</detail>
<detail type="issue">
<caption>no.</caption>
<number>2</number>
</detail>
<extent unit="pages">
<start>187</start>
<end>206</end>
</extent>
</part>
</relatedItem>
<relatedItem type="references" displayLabel="bib1">
<titleInfo>
<title>Chekuri, C., Goldwasser, M., Raghavan, P., Upfal, E., 1997. Web search using automatic classification, Proceedings of the Sixth International World Wide Web Conference WWW6, Santa Clara, USA.</title>
</titleInfo>
<name type="personal">
<namePart type="given">C.</namePart>
<namePart type="family">Chekuri</namePart>
</name>
<name type="personal">
<namePart type="given">M.</namePart>
<namePart type="family">Goldwasser</namePart>
</name>
<name type="personal">
<namePart type="given">P.</namePart>
<namePart type="family">Raghavan</namePart>
</name>
<name type="personal">
<namePart type="given">E.</namePart>
<namePart type="family">Upfal</namePart>
</name>
<genre>other</genre>
<note>1997. Web search using automatic classification, Proceedings of the Sixth International World Wide Web Conference WWW6, Santa Clara, USA.</note>
<note>Chekuri, C., Goldwasser, M., Raghavan, P., Upfal, E., 1997. Web search using automatic classification, Proceedings of the Sixth International World Wide Web Conference WWW6, Santa Clara, USA.</note>
</relatedItem>
<relatedItem type="references" displayLabel="bib2">
<titleInfo>
<title>Bringing order to the web: automatically categorizing search results</title>
</titleInfo>
<name type="personal">
<namePart type="given">H.</namePart>
<namePart type="family">Chen</namePart>
</name>
<name type="personal">
<namePart type="given">S.</namePart>
<namePart type="family">Dumais</namePart>
</name>
<genre>book</genre>
<note>pp. 145–152</note>
<note>Chen, H., Dumais, S., 2000. Bringing order to the web: automatically categorizing search results. Proceedings of CHI’2000, The Hague, Netherlands. ACM Press, New York, pp. 145–152.</note>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of CHI’2000, The Hague, Netherlands</title>
</titleInfo>
<originInfo>
<publisher>ACM Press</publisher>
<place>
<placeTerm type="text">New York</placeTerm>
</place>
</originInfo>
<part>
<date>2000</date>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib3">
<titleInfo>
<title>Chen, M., Hearst, M., Hong, J., Lin, J., 1999. Cha–Cha: a system for organizing Intranet search results, Proceedings of the Second USENIX Symposium on Internet Technologies and Systems (USITS).</title>
</titleInfo>
<name type="personal">
<namePart type="given">M.</namePart>
<namePart type="family">Chen</namePart>
</name>
<name type="personal">
<namePart type="given">M.</namePart>
<namePart type="family">Hearst</namePart>
</name>
<name type="personal">
<namePart type="given">J.</namePart>
<namePart type="family">Hong</namePart>
</name>
<name type="personal">
<namePart type="given">J.</namePart>
<namePart type="family">Lin</namePart>
</name>
<genre>other</genre>
<note>1999. Cha–Cha: a system for organizing Intranet search results, Proceedings of the Second USENIX Symposium on Internet Technologies and Systems (USITS).</note>
<note>Chen, M., Hearst, M., Hong, J., Lin, J., 1999. Cha–Cha: a system for organizing Intranet search results, Proceedings of the Second USENIX Symposium on Internet Technologies and Systems (USITS).</note>
</relatedItem>
<relatedItem type="references" displayLabel="bib4">
<titleInfo>
<title>Cho, E., Myaeng, S., 2000. Visualization of retrieval results using DART, Proceedings of RIAO 2000, Paris, France.</title>
</titleInfo>
<name type="personal">
<namePart type="given">E.</namePart>
<namePart type="family">Cho</namePart>
</name>
<name type="personal">
<namePart type="given">S.</namePart>
<namePart type="family">Myaeng</namePart>
</name>
<genre>other</genre>
<note>2000. Visualization of retrieval results using DART, Proceedings of RIAO 2000, Paris, France.</note>
<note>Cho, E., Myaeng, S., 2000. Visualization of retrieval results using DART, Proceedings of RIAO 2000, Paris, France.</note>
</relatedItem>
<relatedItem type="references" displayLabel="bib5">
<titleInfo>
<title>Scatter/Gather: a cluster-based approach to browsing large document collections</title>
</titleInfo>
<name type="personal">
<namePart type="given">D.</namePart>
<namePart type="family">Cutting</namePart>
</name>
<name type="personal">
<namePart type="given">D.</namePart>
<namePart type="family">Karger</namePart>
</name>
<name type="personal">
<namePart type="given">J.</namePart>
<namePart type="family">Pedersen</namePart>
</name>
<name type="personal">
<namePart type="given">J.</namePart>
<namePart type="family">Tukey</namePart>
</name>
<genre>book</genre>
<note>pp. 318–329</note>
<note>Cutting, D., Karger, D., Pedersen, J., Tukey, J., 1992. Scatter/Gather: a cluster-based approach to browsing large document collections. Proceedings of SIGIR 1992, Copenhagen, Denmark. ACM Press, New York, pp. 318–329.</note>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of SIGIR 1992, Copenhagen, Denmark</title>
</titleInfo>
<originInfo>
<publisher>ACM Press</publisher>
<place>
<placeTerm type="text">New York</placeTerm>
</place>
</originInfo>
<part>
<date>1992</date>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib6">
<titleInfo>
<title>Hierarchical classification of web content</title>
</titleInfo>
<name type="personal">
<namePart type="given">S.</namePart>
<namePart type="family">Dumais</namePart>
</name>
<name type="personal">
<namePart type="given">H.</namePart>
<namePart type="family">Chen</namePart>
</name>
<genre>book</genre>
<note>pp. 256–263</note>
<note>Dumais, S., Chen, H., 2001. Hierarchical classification of web content. Proceedings of SIGIR 2000, Athens, Greece. ACM Press, New York, pp. 256–263.</note>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of SIGIR 2000, Athens, Greece</title>
</titleInfo>
<originInfo>
<publisher>ACM Press</publisher>
<place>
<placeTerm type="text">New York</placeTerm>
</place>
</originInfo>
<part>
<date>2001</date>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib7">
<titleInfo>
<title>Using latent semantic analysis to improve access to textual information</title>
</titleInfo>
<name type="personal">
<namePart type="given">S.</namePart>
<namePart type="family">Dumais</namePart>
</name>
<name type="personal">
<namePart type="given">G.</namePart>
<namePart type="family">Furnas</namePart>
</name>
<name type="personal">
<namePart type="given">T.</namePart>
<namePart type="family">Landauer</namePart>
</name>
<name type="personal">
<namePart type="given">S.</namePart>
<namePart type="family">Deerwester</namePart>
</name>
<name type="personal">
<namePart type="given">R.</namePart>
<namePart type="family">Harshman</namePart>
</name>
<genre>book</genre>
<note>pp. 281–285</note>
<note>Dumais, S., Furnas, G., Landauer, T., Deerwester, S., Harshman, R., 1988. Using latent semantic analysis to improve access to textual information. Proceedings of CHI’88, Washington DC, USA. ACM Press, New York, pp. 281–285.</note>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of CHI’88, Washington DC, USA</title>
</titleInfo>
<originInfo>
<publisher>ACM Press</publisher>
<place>
<placeTerm type="text">New York</placeTerm>
</place>
</originInfo>
<part>
<date>1988</date>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib8">
<titleInfo>
<title>Optimizing search by showing results in context</title>
</titleInfo>
<name type="personal">
<namePart type="given">S.</namePart>
<namePart type="family">Dumais</namePart>
</name>
<name type="personal">
<namePart type="given">E.</namePart>
<namePart type="family">Cutrell</namePart>
</name>
<name type="personal">
<namePart type="given">H.</namePart>
<namePart type="family">Chen</namePart>
</name>
<genre>book</genre>
<note>pp. 277–284</note>
<note>Dumais, S., Cutrell, E., Chen, H., 2001. Optimizing search by showing results in context. Proceedings of CHI’2001, Seattle, USA. ACM Press, New York, pp. 277–284.</note>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of CHI’2001, Seattle, USA</title>
</titleInfo>
<originInfo>
<publisher>ACM Press</publisher>
<place>
<placeTerm type="text">New York</placeTerm>
</place>
</originInfo>
<part>
<date>2001</date>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib9">
<titleInfo>
<title>Reexamining the cluster hypothesis: Scatter/Gather on retrieval results</title>
</titleInfo>
<name type="personal">
<namePart type="given">M.</namePart>
<namePart type="family">Hearst</namePart>
</name>
<name type="personal">
<namePart type="given">J.</namePart>
<namePart type="family">Pedersen</namePart>
</name>
<genre>book</genre>
<note>Hearst, M., Pedersen, J., 1996. Reexamining the cluster hypothesis: Scatter/Gather on retrieval results. Proceedings of ACM SIGIR’96, Zürich, Switzerland. ACM Press, New York.</note>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of ACM SIGIR’96, Zürich, Switzerland</title>
</titleInfo>
<originInfo>
<publisher>ACM Press</publisher>
<place>
<placeTerm type="text">New York</placeTerm>
</place>
</originInfo>
<part>
<date>1996</date>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib10">
<titleInfo>
<title>Searchers, the subjects they search, and sufficiency: a study of a large sample of Excite searchers, 1998 World Conference on the WWW and Internet, Orlando, USA 1998.</title>
</titleInfo>
<name type="personal">
<namePart type="given">B.</namePart>
<namePart type="family">Jansen</namePart>
</name>
<name type="personal">
<namePart type="given">A.</namePart>
<namePart type="family">Spink</namePart>
</name>
<name type="personal">
<namePart type="given">J.</namePart>
<namePart type="family">Bateman</namePart>
</name>
<name type="personal">
<namePart type="given">T.</namePart>
<namePart type="family">Saracevic</namePart>
</name>
<genre>other</genre>
<note>1998. Searchers, the subjects they search, and sufficiency: a study of a large sample of Excite searchers, 1998 World Conference on the WWW and Internet, Orlando, USA 1998.</note>
<note>Jansen, B., Spink, A., Bateman, J., Saracevic, T., 1998. Searchers, the subjects they search, and sufficiency: a study of a large sample of Excite searchers, 1998 World Conference on the WWW and Internet, Orlando, USA 1998.</note>
</relatedItem>
<relatedItem type="references" displayLabel="bib11">
<titleInfo>
<title>Topic-based browsing within a digital library using keyphrases</title>
</titleInfo>
<name type="personal">
<namePart type="given">S.</namePart>
<namePart type="family">Jones</namePart>
</name>
<name type="personal">
<namePart type="given">G.</namePart>
<namePart type="family">Paynter</namePart>
</name>
<genre>book</genre>
<note>pp. 114–121</note>
<note>Jones, S., Paynter, G., 1999. Topic-based browsing within a digital library using keyphrases. Proceedings of the ACM Conference on Digital Libraries, Berkeley, USA. ACM Press, New York, pp. 114–121.</note>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the ACM Conference on Digital Libraries, Berkeley, USA</title>
</titleInfo>
<originInfo>
<publisher>ACM Press</publisher>
<place>
<placeTerm type="text">New York</placeTerm>
</place>
</originInfo>
<part>
<date>1999</date>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib12">
<titleInfo>
<title>Using keyphrases as search result surrogates on small screen devices</title>
</titleInfo>
<name type="personal">
<namePart type="given">S.</namePart>
<namePart type="family">Jones</namePart>
</name>
<name type="personal">
<namePart type="given">M.</namePart>
<namePart type="family">Jones</namePart>
</name>
<name type="personal">
<namePart type="given">S.</namePart>
<namePart type="family">Deo</namePart>
</name>
<genre>journal</genre>
<note>Jones, S., Jones, M., Deo, S., 2004. Using keyphrases as search result surrogates on small screen devices. Personal and Ubiquitous Computing 8 (1), 55–68.</note>
<relatedItem type="host">
<titleInfo>
<title>Personal and Ubiquitous Computing</title>
</titleInfo>
<part>
<date>2004</date>
<detail type="volume">
<caption>vol.</caption>
<number>8</number>
</detail>
<detail type="issue">
<caption>no.</caption>
<number>1</number>
</detail>
<extent unit="pages">
<start>55</start>
<end>68</end>
</extent>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib20">
<titleInfo>
<title>Käki, M., 2004. Proportional Search Interface usability measures. Proceedings of NordiCHI 2004, Tampere, Finland. ACM Press, New York, pp. 365–372.</title>
</titleInfo>
<genre>other</genre>
<note>Käki, M., 2004. Proportional Search Interface usability measures. Proceedings of NordiCHI 2004, Tampere, Finland. ACM Press, New York, pp. 365–372.</note>
</relatedItem>
<relatedItem type="references" displayLabel="bib13">
<titleInfo>
<title>Maarek, Y., Fagin, R., Ben-Shaul, I., Pelleg, D. (2000). Ephemeral document clustering for web applications. IBM Research Report RJ 10186, April 2000.</title>
</titleInfo>
<genre>other</genre>
<note>Maarek, Y., Fagin, R., Ben-Shaul, I., Pelleg, D. (2000). Ephemeral document clustering for web applications. IBM Research Report RJ 10186, April 2000.</note>
</relatedItem>
<relatedItem type="references" displayLabel="bib14">
<titleInfo>
<title>Scatter/Gather browsing communicates the topic structure of a very large text collection</title>
</titleInfo>
<name type="personal">
<namePart type="given">P.</namePart>
<namePart type="family">Pirolli</namePart>
</name>
<name type="personal">
<namePart type="given">P.</namePart>
<namePart type="family">Schank</namePart>
</name>
<name type="personal">
<namePart type="given">M.</namePart>
<namePart type="family">Hearst</namePart>
</name>
<name type="personal">
<namePart type="given">C.</namePart>
<namePart type="family">Diehl</namePart>
</name>
<genre>book</genre>
<note>pp. 213–220</note>
<note>Pirolli, P., Schank, P., Hearst, M., Diehl, C., 1996. Scatter/Gather browsing communicates the topic structure of a very large text collection. Proceedings of CHI’96, Vancouver, Canada. ACM Press, New York, pp. 213–220.</note>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of CHI’96, Vancouver, Canada</title>
</titleInfo>
<originInfo>
<publisher>ACM Press</publisher>
<place>
<placeTerm type="text">New York</placeTerm>
</place>
</originInfo>
<part>
<date>1996</date>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib15">
<titleInfo>
<title>The usefulness of dynamically categorizing search results</title>
</titleInfo>
<name type="personal">
<namePart type="given">W.</namePart>
<namePart type="family">Pratt</namePart>
</name>
<name type="personal">
<namePart type="given">L.</namePart>
<namePart type="family">Fagan</namePart>
</name>
<genre>journal</genre>
<note>Pratt, W., Fagan, L., 2000. The usefulness of dynamically categorizing search results. Journal of the American Medical Informatics Association 7 (6), 605–617.</note>
<relatedItem type="host">
<titleInfo>
<title>Journal of the American Medical Informatics Association</title>
</titleInfo>
<part>
<date>2000</date>
<detail type="volume">
<caption>vol.</caption>
<number>7</number>
</detail>
<detail type="issue">
<caption>no.</caption>
<number>6</number>
</detail>
<extent unit="pages">
<start>605</start>
<end>617</end>
</extent>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib16">
<titleInfo>
<title>Searching the web: the public and their queries</title>
</titleInfo>
<name type="personal">
<namePart type="given">A.</namePart>
<namePart type="family">Spink</namePart>
</name>
<name type="personal">
<namePart type="given">D.</namePart>
<namePart type="family">Wolfram</namePart>
</name>
<name type="personal">
<namePart type="given">M.</namePart>
<namePart type="family">Jansen</namePart>
</name>
<name type="personal">
<namePart type="given">T.</namePart>
<namePart type="family">Saracevic</namePart>
</name>
<genre>journal</genre>
<note>Spink, A., Wolfram, D., Jansen, M., Saracevic, T., 2001. Searching the web: the public and their queries. Journal of the American Society for Information Science and Technology 52 (3), 226–234.</note>
<relatedItem type="host">
<titleInfo>
<title>Journal of the American Society for Information Science and Technology</title>
</titleInfo>
<part>
<date>2001</date>
<detail type="volume">
<caption>vol.</caption>
<number>52</number>
</detail>
<detail type="issue">
<caption>no.</caption>
<number>3</number>
</detail>
<extent unit="pages">
<start>226</start>
<end>234</end>
</extent>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib17">
<titleInfo>
<title>Integration of browsing, searching, and filtering in an applet for web information access</title>
</titleInfo>
<name type="personal">
<namePart type="given">K.</namePart>
<namePart type="family">Wittenburg</namePart>
</name>
<name type="personal">
<namePart type="given">E.</namePart>
<namePart type="family">Sigman</namePart>
</name>
<originInfo>
<publisher>CHI’97 Electronic Publications: Late Breaking/Short Talks</publisher>
<place>
<placeTerm type="text">Atlanta, USA</placeTerm>
</place>
</originInfo>
<genre>book</genre>
<note>Wittenburg, K., Sigman, E., 1997. Integration of browsing, searching, and filtering in an applet for web information access. CHI’97 Electronic Publications: Late Breaking/Short Talks, Atlanta, USA.</note>
<part>
<date>1997</date>
</part>
</relatedItem>
<relatedItem type="references" displayLabel="bib18">
<titleInfo>
<title>Web document clustering: a feasibility demonstration</title>
</titleInfo>
<name type="personal">
<namePart type="given">O.</namePart>
<namePart type="family">Zamir</namePart>
</name>
<name type="personal">
<namePart type="given">O.</namePart>
<namePart type="family">Etzioni</namePart>
</name>
<genre>book</genre>
<note>pp. 46–54</note>
<note>Zamir, O., Etzioni, O., 1998. Web document clustering: a feasibility demonstration. Proceedings of the 19th International SIGIR Conference on Research and Development in Information Retrieval (SIGIR’98). ACM Press, New York, pp. 46–54.</note>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 19th International SIGIR Conference on Research and Development in Information Retrieval (SIGIR’98)</title>
</titleInfo>
<originInfo>
<publisher>ACM Press</publisher>
<place>
<placeTerm type="text">New York</placeTerm>
</place>
</originInfo>
<part>
<date>1998</date>
</part>
</relatedItem>
</relatedItem>
<relatedItem type="references" displayLabel="bib19">
<titleInfo>
<title>Grouper: a dynamic clustering interface to web search results</title>
</titleInfo>
<name type="personal">
<namePart type="given">O.</namePart>
<namePart type="family">Zamir</namePart>
</name>
<name type="personal">
<namePart type="given">O.</namePart>
<namePart type="family">Etzioni</namePart>
</name>
<genre>book</genre>
<note>Zamir, O., Etzioni, O., 1999. Grouper: a dynamic clustering interface to web search results. Proceedings of the Eighth International World Wide Web Conference WWW8, Toronto, Canada. Elsevier, Amsterdam.</note>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the Eighth International World Wide Web Conference WWW8, Toronto, Canada</title>
</titleInfo>
<originInfo>
<publisher>Elsevier</publisher>
<place>
<placeTerm type="text">Amsterdam</placeTerm>
</place>
</originInfo>
<part>
<date>1999</date>
</part>
</relatedItem>
</relatedItem>
<identifier type="istex">F70742F012AD9FF6A3BC4F946E17F85BCBBF22E0</identifier>
<identifier type="ark">ark:/67375/HXZ-WSTC58GQ-9</identifier>
<identifier type="DOI">10.1016/j.intcom.2005.01.001</identifier>
<accessCondition type="use and reproduction" contentType="copyright">© 2005 Elsevier B.V. All rights reserved.</accessCondition>
<recordInfo>
<recordContentSource authority="ISTEX" authorityURI="https://loaded-corpus.data.istex.fr" valueURI="https://loaded-corpus.data.istex.fr/ark:/67375/XBH-GTWS0RDP-M">oup</recordContentSource>
<recordOrigin>Converted from (version 1.2.10) to MODS version 3.6.</recordOrigin>
<recordCreationDate encoding="w3cdtf">2020-04-16</recordCreationDate>
</recordInfo>
</mods>
<json:item>
<extension>json</extension>
<original>false</original>
<mimetype>application/json</mimetype>
<uri>https://api.istex.fr/ark:/67375/HXZ-WSTC58GQ-9/record.json</uri>
</json:item>
</metadata>
<covers>
<json:item>
<extension>gif</extension>
<original>true</original>
<mimetype>image/gif</mimetype>
<uri>https://api.istex.fr/ark:/67375/HXZ-WSTC58GQ-9/covers.gif</uri>
</json:item>
<json:item>
<extension>tiff</extension>
<original>true</original>
<mimetype>image/tiff</mimetype>
<uri>https://api.istex.fr/document/F70742F012AD9FF6A3BC4F946E17F85BCBBF22E0/covers/tiff</uri>
</json:item>
</covers>
<annexes>
<json:item>
<extension>jpeg</extension>
<original>true</original>
<mimetype>image/jpeg</mimetype>
<uri>https://api.istex.fr/ark:/67375/HXZ-WSTC58GQ-9/annexes.jpeg</uri>
</json:item>
</annexes>
<serie></serie>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Sante/explor/SrasV1/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001C98 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 001C98 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Sante
   |area=    SrasV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:F70742F012AD9FF6A3BC4F946E17F85BCBBF22E0
   |texte=   Findex: improving search result use through automatic filtering categories
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Tue Apr 28 14:49:16 2020. Site generation: Sat Mar 27 22:06:49 2021