MusicSarreV3, Main, Curation, bibRecord, 000A70

Towards a Statistically Semantic Web

Internal identifier: 000A70 (Main/Curation); previous: 000A69; next: 000A71

Authors: Gerhard Weikum [Germany]; Jens Graupmann [Germany]; Ralf Schenkel [Germany]; Martin Theobald [Germany]

Source:

Lecture Notes in Computer Science [0302-9743]; 2004.

RBID : ISTEX:E281C55D5902501C698FDE7C2BDFE0E9C38648E8

English descriptors

Teeft :
- Algorithm, American society, Anja theobald, Annotation, Background knowledge, Christian zimmer, Click streams, Complex operator trees, Corpus statistics, Data collections, Database, Digital libraries, Document collections, Gene expression data, Gerhard, Gerhard weikum, High dynamics, Holger meuss, Index list, Index lists, Information extraction, Information retrieval, Information science, Information search, Intelligent search, International workshop, Jens graupmann, Keyword search, Large result lists, Latter case, Lise getoor, Lower bounds, Markov models, Martin theobald, Matthias bender, Natural language, Norbert fuhr, Ontological similarity, Ontology, Original query, Previous section, Probabilistic, Probabilistic guarantees, Query, Query conditions, Query expansion, Query languages, Query logs, Query processing, Ralf schenkel, Random accesses, Random surfer, Random variables, Relevance feedback, Relevant pages, Retrieval, Rich body, Schema, Score distributions, Scottish nobleman, Search engine, Search engines, Semantic, Semantic relationships, Sigir, Sigir workshop, Similarity operator, Springer, Statistical information, Statistical reasoning, Steffen staab, Theobald, Torsten schlieder, Total score, Unknown scores, Upper bounds, User, User behavior, Valuable information, Vldb, Vldb journal, Weikum, Wordnet, Wordnet thesaurus.

Abstract

Abstract: The envisioned Semantic Web aims to provide richly annotated and explicitly structured Web pages in XML, RDF, or description logics, based upon underlying ontologies and thesauri. Ideally, this should enable a wealth of query processing and semantic reasoning capabilities using XQuery and logical inference engines. However, we believe that the diversity and uncertainty of terminologies and schema-like annotations will make precise querying on a Web scale extremely elusive if not hopeless, and the same argument holds for large-scale dynamic federations of Deep Web sources. Therefore, ontology-based reasoning and querying needs to be enhanced by statistical means, leading to relevance-ranked lists as query results. This paper presents steps towards such a “statistically semantic” Web and outlines technical challenges. We discuss how statistically quantified ontological relations can be exploited in XML retrieval, how statistics can help in making Web-scale search efficient, and how statistical information extracted from users’ query logs and click streams can be leveraged for better search result ranking. We believe these are decisive issues for improving the quality of next-generation search engines for intranets, digital libraries, and the Web, and they are crucial also for peer-to-peer collaborative Web search.

Url:

https://api.istex.fr/document/E281C55D5902501C698FDE7C2BDFE0E9C38648E8/fulltext/pdf

DOI: 10.1007/978-3-540-30464-7_2

Links toward previous steps (curation, corpus...)

to stream Istex, to step Corpus: Go to this record in the Curation step: 001744
to stream Istex, to step Curation: Go to this record in the Curation step: 001637
to stream Istex, to step Checkpoint: Go to this record in the Curation step: 000865
to stream Main, to step Merge: Go to this record in the Curation step: 000A70

Links to Exploration step

ISTEX:E281C55D5902501C698FDE7C2BDFE0E9C38648E8

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Towards a Statistically Semantic Web</title>
<author><name sortKey="Weikum, Gerhard" sort="Weikum, Gerhard" uniqKey="Weikum G" first="Gerhard" last="Weikum">Gerhard Weikum</name>
</author>
<author><name sortKey="Graupmann, Jens" sort="Graupmann, Jens" uniqKey="Graupmann J" first="Jens" last="Graupmann">Jens Graupmann</name>
</author>
<author><name sortKey="Schenkel, Ralf" sort="Schenkel, Ralf" uniqKey="Schenkel R" first="Ralf" last="Schenkel">Ralf Schenkel</name>
</author>
<author><name sortKey="Theobald, Martin" sort="Theobald, Martin" uniqKey="Theobald M" first="Martin" last="Theobald">Martin Theobald</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:E281C55D5902501C698FDE7C2BDFE0E9C38648E8</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-30464-7_2</idno>
<idno type="url">https://api.istex.fr/document/E281C55D5902501C698FDE7C2BDFE0E9C38648E8/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001744</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001744</idno>
<idno type="wicri:Area/Istex/Curation">001637</idno>
<idno type="wicri:Area/Istex/Checkpoint">000865</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000865</idno>
<idno type="wicri:doubleKey">0302-9743:2004:Weikum G:towards:a:statistically</idno>
<idno type="wicri:Area/Main/Merge">000A70</idno>
<idno type="wicri:Area/Main/Curation">000A70</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Towards a Statistically Semantic Web</title>
<author><name sortKey="Weikum, Gerhard" sort="Weikum, Gerhard" uniqKey="Weikum G" first="Gerhard" last="Weikum">Gerhard Weikum</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Max-Planck Institute of Computer Science, Saarbruecken</wicri:regionArea>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Graupmann, Jens" sort="Graupmann, Jens" uniqKey="Graupmann J" first="Jens" last="Graupmann">Jens Graupmann</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Max-Planck Institute of Computer Science, Saarbruecken</wicri:regionArea>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Schenkel, Ralf" sort="Schenkel, Ralf" uniqKey="Schenkel R" first="Ralf" last="Schenkel">Ralf Schenkel</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Max-Planck Institute of Computer Science, Saarbruecken</wicri:regionArea>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Theobald, Martin" sort="Theobald, Martin" uniqKey="Theobald M" first="Martin" last="Theobald">Martin Theobald</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Max-Planck Institute of Computer Science, Saarbruecken</wicri:regionArea>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2004</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="Teeft" xml:lang="en"><term>Algorithm</term>
<term>American society</term>
<term>Anja theobald</term>
<term>Annotation</term>
<term>Background knowledge</term>
<term>Christian zimmer</term>
<term>Click streams</term>
<term>Complex operator trees</term>
<term>Corpus statistics</term>
<term>Data collections</term>
<term>Database</term>
<term>Digital libraries</term>
<term>Document collections</term>
<term>Gene expression data</term>
<term>Gerhard</term>
<term>Gerhard weikum</term>
<term>High dynamics</term>
<term>Holger meuss</term>
<term>Index list</term>
<term>Index lists</term>
<term>Information extraction</term>
<term>Information retrieval</term>
<term>Information science</term>
<term>Information search</term>
<term>Intelligent search</term>
<term>International workshop</term>
<term>Jens graupmann</term>
<term>Keyword search</term>
<term>Large result lists</term>
<term>Latter case</term>
<term>Lise getoor</term>
<term>Lower bounds</term>
<term>Markov models</term>
<term>Martin theobald</term>
<term>Matthias bender</term>
<term>Natural language</term>
<term>Norbert fuhr</term>
<term>Ontological similarity</term>
<term>Ontology</term>
<term>Original query</term>
<term>Previous section</term>
<term>Probabilistic</term>
<term>Probabilistic guarantees</term>
<term>Query</term>
<term>Query conditions</term>
<term>Query expansion</term>
<term>Query languages</term>
<term>Query logs</term>
<term>Query processing</term>
<term>Ralf schenkel</term>
<term>Random accesses</term>
<term>Random surfer</term>
<term>Random variables</term>
<term>Relevance feedback</term>
<term>Relevant pages</term>
<term>Retrieval</term>
<term>Rich body</term>
<term>Schema</term>
<term>Score distributions</term>
<term>Scottish nobleman</term>
<term>Search engine</term>
<term>Search engines</term>
<term>Semantic</term>
<term>Semantic relationships</term>
<term>Sigir</term>
<term>Sigir workshop</term>
<term>Similarity operator</term>
<term>Springer</term>
<term>Statistical information</term>
<term>Statistical reasoning</term>
<term>Steffen staab</term>
<term>Theobald</term>
<term>Torsten schlieder</term>
<term>Total score</term>
<term>Unknown scores</term>
<term>Upper bounds</term>
<term>User</term>
<term>User behavior</term>
<term>Valuable information</term>
<term>Vldb</term>
<term>Vldb journal</term>
<term>Weikum</term>
<term>Wordnet</term>
<term>Wordnet thesaurus</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: The envisioned Semantic Web aims to provide richly annotated and explicitly structured Web pages in XML, RDF, or description logics, based upon underlying ontologies and thesauri. Ideally, this should enable a wealth of query processing and semantic reasoning capabilities using XQuery and logical inference engines. However, we believe that the diversity and uncertainty of terminologies and schema-like annotations will make precise querying on a Web scale extremely elusive if not hopeless, and the same argument holds for large-scale dynamic federations of Deep Web sources. Therefore, ontology-based reasoning and querying needs to be enhanced by statistical means, leading to relevance-ranked lists as query results. This paper presents steps towards such a “statistically semantic” Web and outlines technical challenges. We discuss how statistically quantified ontological relations can be exploited in XML retrieval, how statistics can help in making Web-scale search efficient, and how statistical information extracted from users’ query logs and click streams can be leveraged for better search result ranking. We believe these are decisive issues for improving the quality of next-generation search engines for intranets, digital libraries, and the Web, and they are crucial also for peer-to-peer collaborative Web search.</div>
</front>
</TEI>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Main/Curation

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A70 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Curation/biblio.hfd -nk 000A70 | SxmlIndent | more
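
Outside Dilib, the record shown above can also be read with a general-purpose XML parser. The following is a minimal sketch in Python, not part of the Dilib toolchain. It assumes a trimmed copy of the record (two authors only) and binds the `wicri:` prefix to a placeholder namespace URI, because the record as displayed does not declare that prefix and a strict parser would otherwise reject it. The names `parse_record` and `PLACEHOLDER_NS`, and the URI itself, are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record above (two authors only, for brevity).
RECORD = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Towards a Statistically Semantic Web</title>
<author><name sortKey="Weikum, Gerhard" first="Gerhard" last="Weikum">Gerhard Weikum</name>
</author>
<author><name sortKey="Graupmann, Jens" first="Jens" last="Graupmann">Jens Graupmann</name>
</author>
</titleStmt>
<publicationStmt><idno type="doi">10.1007/978-3-540-30464-7_2</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Hypothetical namespace URI: the displayed record does not declare the
# "wicri" prefix, so we bind it to a placeholder to make the text parseable.
PLACEHOLDER_NS = "urn:example:wicri"


def parse_record(text):
    """Return title, author names and DOI from a Wicri-style record string."""
    # Declare the wicri prefix on the root element (first occurrence only).
    bound = text.replace(
        "<record>", f'<record xmlns:wicri="{PLACEHOLDER_NS}">', 1
    )
    root = ET.fromstring(bound)
    return {
        "title": root.findtext(".//titleStmt/title"),
        "authors": [n.text for n in root.findall(".//titleStmt/author/name")],
        "doi": root.findtext(".//idno[@type='doi']"),
    }


if __name__ == "__main__":
    info = parse_record(RECORD)
    print(info["title"])
    print("; ".join(info["authors"]))
    print(info["doi"])
```

The same function should work on the output of the `HfdSelect` commands above, provided that output has the same `<record>` root element; this has not been verified against a Dilib installation.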

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Sarre
   |area=    MusicSarreV3
   |flux=    Main
   |étape=   Curation
   |type=    RBID
   |clé=     ISTEX:E281C55D5902501C698FDE7C2BDFE0E9C38648E8
   |texte=   Towards a Statistically Semantic Web
}}

This area was generated with Dilib version V0.6.33.
Data generation: Sun Jul 15 18:16:09 2018. Site generation: Tue Mar 5 19:21:25 2024

	Exploration server on music in Saarland
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
