Towards a Statistically Semantic Web
Internal identifier: 000A70 (Main/Curation)
Authors: Gerhard Weikum [Germany]; Jens Graupmann [Germany]; Ralf Schenkel [Germany]; Martin Theobald [Germany]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2004.
English descriptors
- Teeft :
- Algorithm, American society, Anja theobald, Annotation, Background knowledge, Christian zimmer, Click streams, Complex operator trees, Corpus statistics, Data collections, Database, Digital libraries, Document collections, Gene expression data, Gerhard, Gerhard weikum, High dynamics, Holger meuss, Index list, Index lists, Information extraction, Information retrieval, Information science, Information search, Intelligent search, International workshop, Jens graupmann, Keyword search, Large result lists, Latter case, Lise getoor, Lower bounds, Markov models, Martin theobald, Matthias bender, Natural language, Norbert fuhr, Ontological similarity, Ontology, Original query, Previous section, Probabilistic, Probabilistic guarantees, Query, Query conditions, Query expansion, Query languages, Query logs, Query processing, Ralf schenkel, Random accesses, Random surfer, Random variables, Relevance feedback, Relevant pages, Retrieval, Rich body, Schema, Score distributions, Scottish nobleman, Search engine, Search engines, Semantic, Semantic relationships, Sigir, Sigir workshop, Similarity operator, Springer, Statistical information, Statistical reasoning, Steffen staab, Theobald, Torsten schlieder, Total score, Unknown scores, Upper bounds, User, User behavior, Valuable information, Vldb, Vldb journal, Weikum, Wordnet, Wordnet thesaurus.
Abstract
Abstract: The envisioned Semantic Web aims to provide richly annotated and explicitly structured Web pages in XML, RDF, or description logics, based upon underlying ontologies and thesauri. Ideally, this should enable a wealth of query processing and semantic reasoning capabilities using XQuery and logical inference engines. However, we believe that the diversity and uncertainty of terminologies and schema-like annotations will make precise querying on a Web scale extremely elusive if not hopeless, and the same argument holds for large-scale dynamic federations of Deep Web sources. Therefore, ontology-based reasoning and querying needs to be enhanced by statistical means, leading to relevance-ranked lists as query results. This paper presents steps towards such a “statistically semantic” Web and outlines technical challenges. We discuss how statistically quantified ontological relations can be exploited in XML retrieval, how statistics can help in making Web-scale search efficient, and how statistical information extracted from users’ query logs and click streams can be leveraged for better search result ranking. We believe these are decisive issues for improving the quality of next-generation search engines for intranets, digital libraries, and the Web, and they are crucial also for peer-to-peer collaborative Web search.
DOI: 10.1007/978-3-540-30464-7_2
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: To go to this record in the Curation step: 001744
- to stream Istex, to step Curation: To go to this record in the Curation step: 001637
- to stream Istex, to step Checkpoint: To go to this record in the Curation step: 000865
- to stream Main, to step Merge: To go to this record in the Curation step: 000A70
Links to Exploration step
ISTEX:E281C55D5902501C698FDE7C2BDFE0E9C38648E8
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Towards a Statistically Semantic Web</title>
<author><name sortKey="Weikum, Gerhard" sort="Weikum, Gerhard" uniqKey="Weikum G" first="Gerhard" last="Weikum">Gerhard Weikum</name>
</author>
<author><name sortKey="Graupmann, Jens" sort="Graupmann, Jens" uniqKey="Graupmann J" first="Jens" last="Graupmann">Jens Graupmann</name>
</author>
<author><name sortKey="Schenkel, Ralf" sort="Schenkel, Ralf" uniqKey="Schenkel R" first="Ralf" last="Schenkel">Ralf Schenkel</name>
</author>
<author><name sortKey="Theobald, Martin" sort="Theobald, Martin" uniqKey="Theobald M" first="Martin" last="Theobald">Martin Theobald</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:E281C55D5902501C698FDE7C2BDFE0E9C38648E8</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-30464-7_2</idno>
<idno type="url">https://api.istex.fr/document/E281C55D5902501C698FDE7C2BDFE0E9C38648E8/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001744</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001744</idno>
<idno type="wicri:Area/Istex/Curation">001637</idno>
<idno type="wicri:Area/Istex/Checkpoint">000865</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000865</idno>
<idno type="wicri:doubleKey">0302-9743:2004:Weikum G:towards:a:statistically</idno>
<idno type="wicri:Area/Main/Merge">000A70</idno>
<idno type="wicri:Area/Main/Curation">000A70</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Towards a Statistically Semantic Web</title>
<author><name sortKey="Weikum, Gerhard" sort="Weikum, Gerhard" uniqKey="Weikum G" first="Gerhard" last="Weikum">Gerhard Weikum</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Max-Planck Institute of Computer Science, Saarbruecken</wicri:regionArea>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Graupmann, Jens" sort="Graupmann, Jens" uniqKey="Graupmann J" first="Jens" last="Graupmann">Jens Graupmann</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Max-Planck Institute of Computer Science, Saarbruecken</wicri:regionArea>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Schenkel, Ralf" sort="Schenkel, Ralf" uniqKey="Schenkel R" first="Ralf" last="Schenkel">Ralf Schenkel</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Max-Planck Institute of Computer Science, Saarbruecken</wicri:regionArea>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Theobald, Martin" sort="Theobald, Martin" uniqKey="Theobald M" first="Martin" last="Theobald">Martin Theobald</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Max-Planck Institute of Computer Science, Saarbruecken</wicri:regionArea>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2004</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="Teeft" xml:lang="en"><term>Algorithm</term>
<term>American society</term>
<term>Anja theobald</term>
<term>Annotation</term>
<term>Background knowledge</term>
<term>Christian zimmer</term>
<term>Click streams</term>
<term>Complex operator trees</term>
<term>Corpus statistics</term>
<term>Data collections</term>
<term>Database</term>
<term>Digital libraries</term>
<term>Document collections</term>
<term>Gene expression data</term>
<term>Gerhard</term>
<term>Gerhard weikum</term>
<term>High dynamics</term>
<term>Holger meuss</term>
<term>Index list</term>
<term>Index lists</term>
<term>Information extraction</term>
<term>Information retrieval</term>
<term>Information science</term>
<term>Information search</term>
<term>Intelligent search</term>
<term>International workshop</term>
<term>Jens graupmann</term>
<term>Keyword search</term>
<term>Large result lists</term>
<term>Latter case</term>
<term>Lise getoor</term>
<term>Lower bounds</term>
<term>Markov models</term>
<term>Martin theobald</term>
<term>Matthias bender</term>
<term>Natural language</term>
<term>Norbert fuhr</term>
<term>Ontological similarity</term>
<term>Ontology</term>
<term>Original query</term>
<term>Previous section</term>
<term>Probabilistic</term>
<term>Probabilistic guarantees</term>
<term>Query</term>
<term>Query conditions</term>
<term>Query expansion</term>
<term>Query languages</term>
<term>Query logs</term>
<term>Query processing</term>
<term>Ralf schenkel</term>
<term>Random accesses</term>
<term>Random surfer</term>
<term>Random variables</term>
<term>Relevance feedback</term>
<term>Relevant pages</term>
<term>Retrieval</term>
<term>Rich body</term>
<term>Schema</term>
<term>Score distributions</term>
<term>Scottish nobleman</term>
<term>Search engine</term>
<term>Search engines</term>
<term>Semantic</term>
<term>Semantic relationships</term>
<term>Sigir</term>
<term>Sigir workshop</term>
<term>Similarity operator</term>
<term>Springer</term>
<term>Statistical information</term>
<term>Statistical reasoning</term>
<term>Steffen staab</term>
<term>Theobald</term>
<term>Torsten schlieder</term>
<term>Total score</term>
<term>Unknown scores</term>
<term>Upper bounds</term>
<term>User</term>
<term>User behavior</term>
<term>Valuable information</term>
<term>Vldb</term>
<term>Vldb journal</term>
<term>Weikum</term>
<term>Wordnet</term>
<term>Wordnet thesaurus</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: The envisioned Semantic Web aims to provide richly annotated and explicitly structured Web pages in XML, RDF, or description logics, based upon underlying ontologies and thesauri. Ideally, this should enable a wealth of query processing and semantic reasoning capabilities using XQuery and logical inference engines. However, we believe that the diversity and uncertainty of terminologies and schema-like annotations will make precise querying on a Web scale extremely elusive if not hopeless, and the same argument holds for large-scale dynamic federations of Deep Web sources. Therefore, ontology-based reasoning and querying needs to be enhanced by statistical means, leading to relevance-ranked lists as query results. This paper presents steps towards such a “statistically semantic” Web and outlines technical challenges. We discuss how statistically quantified ontological relations can be exploited in XML retrieval, how statistics can help in making Web-scale search efficient, and how statistical information extracted from users’ query logs and click streams can be leveraged for better search result ranking. We believe these are decisive issues for improving the quality of next-generation search engines for intranets, digital libraries, and the Web, and they are crucial also for peer-to-peer collaborative Web search.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Main/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A70 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Curation/biblio.hfd -nk 000A70 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Sarre |area= MusicSarreV3 |flux= Main |étape= Curation |type= RBID |clé= ISTEX:E281C55D5902501C698FDE7C2BDFE0E9C38648E8 |texte= Towards a Statistically Semantic Web }}
This area was generated with Dilib version V0.6.33.