Exploration server on Parkinson's disease

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Abstracts versus Full Texts and Patents: A Quantitative Analysis of Biomedical Entities

Internal identifier: 000828 (Main/Exploration); previous: 000827; next: 000829

Authors: Bernd Müller [Germany]; Roman Klinger [Germany]; Harsha Gurulingappa [Germany]; Heinz-Theodor Mevissen [Germany]; Martin Hofmann-Apitius [Germany]; Juliane Fluck [Germany]; M. Friedrich [Germany]

RBID : ISTEX:4B4E58C92774A2A30C1916539C1EF0A48DC4CAFB

Abstract

In information retrieval, named entity recognition makes it possible to apply semantic search in domain-specific corpora. Recently, more full-text patents and journal articles have become freely available. As the distribution of information among the different sections is unknown, an analysis of the diversity is of interest. This paper examines the density and variety of relevant life science terminologies in Medline abstracts, PubMedCentral journal articles and patents from the TREC Chemistry Track. For this purpose, named entity recognition for various biological, pharmaceutical, and chemical entity classes has been conducted, and the frequencies and distributions in the different text zones have been analyzed. The full texts from PubMedCentral contain more information than their abstracts while covering almost all of the content of those abstracts. In the patents from the TREC Chemistry Track, the difference is even more extreme: the description section includes almost all entities mentioned in a patent and, compared with the claims section, contains at least 79% of all entities exclusively.

DOI: 10.1007/978-3-642-13084-7_12



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Abstracts versus Full Texts and Patents: A Quantitative Analysis of Biomedical Entities</title>
<author>
<name sortKey="Muller, Bernd" sort="Muller, Bernd" uniqKey="Muller B" first="Bernd" last="Müller">Bernd Müller</name>
</author>
<author>
<name sortKey="Klinger, Roman" sort="Klinger, Roman" uniqKey="Klinger R" first="Roman" last="Klinger">Roman Klinger</name>
</author>
<author>
<name sortKey="Gurulingappa, Harsha" sort="Gurulingappa, Harsha" uniqKey="Gurulingappa H" first="Harsha" last="Gurulingappa">Harsha Gurulingappa</name>
</author>
<author>
<name sortKey="Mevissen, Heinz Theodor" sort="Mevissen, Heinz Theodor" uniqKey="Mevissen H" first="Heinz-Theodor" last="Mevissen">Heinz-Theodor Mevissen</name>
</author>
<author>
<name sortKey="Hofmann Apitius, Martin" sort="Hofmann Apitius, Martin" uniqKey="Hofmann Apitius M" first="Martin" last="Hofmann-Apitius">Martin Hofmann-Apitius</name>
</author>
<author>
<name sortKey="Fluck, Juliane" sort="Fluck, Juliane" uniqKey="Fluck J" first="Juliane" last="Fluck">Juliane Fluck</name>
</author>
<author>
<name sortKey="Friedrich, M" sort="Friedrich, M" uniqKey="Friedrich M" first="M." last="Friedrich">M. Friedrich</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:4B4E58C92774A2A30C1916539C1EF0A48DC4CAFB</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-13084-7_12</idno>
<idno type="url">https://api.istex.fr/document/4B4E58C92774A2A30C1916539C1EF0A48DC4CAFB/fulltext/pdf</idno>
<idno type="wicri:Area/Main/Corpus">000A10</idno>
<idno type="wicri:Area/Main/Curation">000881</idno>
<idno type="wicri:Area/Main/Exploration">000828</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Abstracts versus Full Texts and Patents: A Quantitative Analysis of Biomedical Entities</title>
<author>
<name sortKey="Muller, Bernd" sort="Muller, Bernd" uniqKey="Muller B" first="Bernd" last="Müller">Bernd Müller</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53754, Sankt Augustin</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Sankt Augustin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Bonn-Aachen International Center for Information Technology (B-IT), Dahlmannstraße 2, 53113, Bonn</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Bonn</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Klinger, Roman" sort="Klinger, Roman" uniqKey="Klinger R" first="Roman" last="Klinger">Roman Klinger</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53754, Sankt Augustin</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Sankt Augustin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Gurulingappa, Harsha" sort="Gurulingappa, Harsha" uniqKey="Gurulingappa H" first="Harsha" last="Gurulingappa">Harsha Gurulingappa</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53754, Sankt Augustin</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Sankt Augustin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Bonn-Aachen International Center for Information Technology (B-IT), Dahlmannstraße 2, 53113, Bonn</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Bonn</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Mevissen, Heinz Theodor" sort="Mevissen, Heinz Theodor" uniqKey="Mevissen H" first="Heinz-Theodor" last="Mevissen">Heinz-Theodor Mevissen</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53754, Sankt Augustin</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Sankt Augustin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Hofmann Apitius, Martin" sort="Hofmann Apitius, Martin" uniqKey="Hofmann Apitius M" first="Martin" last="Hofmann-Apitius">Martin Hofmann-Apitius</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53754, Sankt Augustin</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Sankt Augustin</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Bonn-Aachen International Center for Information Technology (B-IT), Dahlmannstraße 2, 53113, Bonn</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Bonn</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Fluck, Juliane" sort="Fluck, Juliane" uniqKey="Fluck J" first="Juliane" last="Fluck">Juliane Fluck</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53754, Sankt Augustin</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Sankt Augustin</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Friedrich, M" sort="Friedrich, M" uniqKey="Friedrich M" first="M." last="Friedrich">M. Friedrich</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53754, Sankt Augustin</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District de Cologne</region>
<settlement type="city">Sankt Augustin</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">4B4E58C92774A2A30C1916539C1EF0A48DC4CAFB</idno>
<idno type="DOI">10.1007/978-3-642-13084-7_12</idno>
<idno type="ChapterID">Chap12</idno>
<idno type="ChapterID">12</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: In information retrieval, named entity recognition makes it possible to apply semantic search in domain-specific corpora. Recently, more full-text patents and journal articles have become freely available. As the distribution of information among the different sections is unknown, an analysis of the diversity is of interest. This paper examines the density and variety of relevant life science terminologies in Medline abstracts, PubMedCentral journal articles and patents from the TREC Chemistry Track. For this purpose, named entity recognition for various biological, pharmaceutical, and chemical entity classes has been conducted, and the frequencies and distributions in the different text zones have been analyzed. The full texts from PubMedCentral contain more information than their abstracts while covering almost all of the content of those abstracts. In the patents from the TREC Chemistry Track, the difference is even more extreme: the description section includes almost all entities mentioned in a patent and, compared with the claims section, contains at least 79% of all entities exclusively.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Allemagne</li>
</country>
<region>
<li>District de Cologne</li>
<li>Rhénanie-du-Nord-Westphalie</li>
</region>
<settlement>
<li>Bonn</li>
<li>Sankt Augustin</li>
</settlement>
</list>
<tree>
<country name="Allemagne">
<region name="Rhénanie-du-Nord-Westphalie">
<name sortKey="Muller, Bernd" sort="Muller, Bernd" uniqKey="Muller B" first="Bernd" last="Müller">Bernd Müller</name>
</region>
<name sortKey="Fluck, Juliane" sort="Fluck, Juliane" uniqKey="Fluck J" first="Juliane" last="Fluck">Juliane Fluck</name>
<name sortKey="Friedrich, M" sort="Friedrich, M" uniqKey="Friedrich M" first="M." last="Friedrich">M. Friedrich</name>
<name sortKey="Gurulingappa, Harsha" sort="Gurulingappa, Harsha" uniqKey="Gurulingappa H" first="Harsha" last="Gurulingappa">Harsha Gurulingappa</name>
<name sortKey="Gurulingappa, Harsha" sort="Gurulingappa, Harsha" uniqKey="Gurulingappa H" first="Harsha" last="Gurulingappa">Harsha Gurulingappa</name>
<name sortKey="Hofmann Apitius, Martin" sort="Hofmann Apitius, Martin" uniqKey="Hofmann Apitius M" first="Martin" last="Hofmann-Apitius">Martin Hofmann-Apitius</name>
<name sortKey="Hofmann Apitius, Martin" sort="Hofmann Apitius, Martin" uniqKey="Hofmann Apitius M" first="Martin" last="Hofmann-Apitius">Martin Hofmann-Apitius</name>
<name sortKey="Klinger, Roman" sort="Klinger, Roman" uniqKey="Klinger R" first="Roman" last="Klinger">Roman Klinger</name>
<name sortKey="Mevissen, Heinz Theodor" sort="Mevissen, Heinz Theodor" uniqKey="Mevissen H" first="Heinz-Theodor" last="Mevissen">Heinz-Theodor Mevissen</name>
<name sortKey="Muller, Bernd" sort="Muller, Bernd" uniqKey="Muller B" first="Bernd" last="Müller">Bernd Müller</name>
</country>
</tree>
</affiliations>
</record>
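
The record above can also be read with a general-purpose XML parser instead of the Dilib tools. The following is a minimal sketch in Python, not part of Dilib or Wicri. It assumes the record is available as a string; `RECORD` is a shortened, hypothetical copy holding only two authors. Because the `wicri:` prefix is not declared in the record as shown, the sketch binds it to a placeholder URI (`urn:example:wicri`, an assumption, not an official namespace) before parsing. The helper names `parse_record` and `title_and_authors` are illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened copy of the record above (two authors only).
RECORD = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Abstracts versus Full Texts and Patents: A Quantitative Analysis of Biomedical Entities</title>
<author>
<name sortKey="Muller, Bernd" first="Bernd" last="Müller">Bernd Müller</name>
</author>
<author>
<name sortKey="Klinger, Roman" first="Roman" last="Klinger">Roman Klinger</name>
</author>
</titleStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Assumption: placeholder URI for the undeclared wicri: prefix.
WICRI_NS = "urn:example:wicri"


def parse_record(text):
    """Declare the wicri: prefix on the root element, then parse."""
    patched = text.replace("<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1)
    return ET.fromstring(patched)


def title_and_authors(root):
    """Return the title and the author names listed in <titleStmt>."""
    stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    title = stmt.findtext("title")
    authors = [name.text for name in stmt.findall("author/name")]
    return title, authors


root = parse_record(RECORD)
title, authors = title_and_authors(root)
print(title)
print(authors)
```

Reading from `<titleStmt>` rather than from `<analytic>` yields each author once, without the affiliation blocks.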

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Sante/explor/ParkinsonV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000828 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000828 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Sante
   |area=    ParkinsonV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:4B4E58C92774A2A30C1916539C1EF0A48DC4CAFB
   |texte=   Abstracts versus Full Texts and Patents: A Quantitative Analysis of Biomedical Entities
}}

This area was generated with Dilib version V0.6.23.
Data generation: Sun Jul 3 18:06:51 2016. Site generation: Wed Mar 6 18:46:03 2024