Exploration server on open access in Belgium

Warning: this site is under development!
Warning: this site is generated automatically from raw corpora.
The information has therefore not been validated.

A machine‐learning approach to negation and speculation detection in clinical texts

Internal identifier: 001605 (Istex/Corpus); previous: 001604; next: 001606

Authors: Noa P. Cruz Díaz; Manuel J. Maña López; Jacinto Mata Vázquez; Victoria Pachón Álvarez

Source :

RBID : ISTEX:97B352EA21591E79646B653C076A9252B3C37074

Abstract

Detecting negative and speculative information is essential in most biomedical text‐mining tasks where these language forms are used to express impressions, hypotheses, or explanations of experimental results. Our research is focused on developing a system based on machine‐learning techniques that identifies negation and speculation signals and their scope in clinical texts. The proposed system works in two consecutive phases: first, a classifier decides whether each token in a sentence is a negation/speculation signal or not. Then another classifier determines, at sentence level, the tokens which are affected by the signals previously identified. The system was trained and evaluated on the clinical texts of the BioScope corpus, a freely available resource consisting of medical and biological texts: full‐length articles, scientific abstracts, and clinical reports. The results obtained by our system were compared with those of two different systems, one based on regular expressions and the other based on machine learning. Our system's results outperformed the results obtained by these two systems. In the signal detection task, the F‐score value was 97.3% in negation and 94.9% in speculation. In the scope‐finding task, a token was correctly classified if it had been properly identified as being inside or outside the scope of all the negation signals present in the sentence. Our proposal showed an F score of 93.2% in negation and 80.9% in speculation. Additionally, the percentage of correct scopes (those with all their tokens correctly classified) was evaluated obtaining F scores of 90.9% in negation and 71.9% in speculation.
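
The two consecutive phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the paper trains classifiers, whereas the cue lexicon, the punctuation-based scope rule, and the function names `detect_signals`, `find_scope`, and `f_score` below are all hypothetical stand-ins chosen so the example is self-contained. The abstract does not give the exact F-score formula, so the standard F1 definition is assumed.

```python
from typing import List, Set

# Hypothetical cue lexicon (assumption); the paper learns signals with a classifier.
NEGATION_CUES: Set[str] = {"no", "not", "without"}


def detect_signals(tokens: List[str], cues: Set[str]) -> List[bool]:
    """Phase 1 stand-in: decide for each token whether it is a signal."""
    return [tok.lower() in cues for tok in tokens]


def find_scope(tokens: List[str], signals: List[bool]) -> List[bool]:
    """Phase 2 stand-in: mark the tokens affected by a previously found signal.

    Toy rule (assumption): the scope starts at the signal itself and ends
    at the next punctuation token or at the end of the sentence.
    """
    in_scope = [False] * len(tokens)
    active = False
    for i, tok in enumerate(tokens):
        if tok in {".", ",", ";"}:
            active = False
        elif signals[i]:
            active = True
        if active:
            in_scope[i] = True
    return in_scope


def f_score(gold: List[bool], predicted: List[bool]) -> float:
    """Token-level F1: harmonic mean of precision and recall."""
    tp = sum(g and p for g, p in zip(gold, predicted))
    fp = sum((not g) and p for g, p in zip(gold, predicted))
    fn = sum(g and (not p) for g, p in zip(gold, predicted))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


tokens = "No evidence of pneumonia , but effusion is present .".split()
signals = detect_signals(tokens, NEGATION_CUES)
scope = find_scope(tokens, signals)
print([tok for tok, inside in zip(tokens, scope) if inside])
```

On this invented sentence the toy rule places "No evidence of pneumonia" inside the scope and leaves the rest outside. Keeping the two phases as separate functions mirrors the abstract's description, in which the second classifier consumes the signals found by the first.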

Url:
DOI: 10.1002/asi.22679

Links to Exploration step

ISTEX:97B352EA21591E79646B653C076A9252B3C37074

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A machine‐learning approach to negation and speculation detection in clinical texts</title>
<author>
<name sortKey="Cruz Diaz, Noa P" sort="Cruz Diaz, Noa P" uniqKey="Cruz Diaz N" first="Noa P." last="Cruz Díaz">Noa P. Cruz Díaz</name>
<affiliation>
<mods:affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: noa.cruz@dti.uhu.es</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Ma A L Pez, Manuel J" sort="Ma A L Pez, Manuel J" uniqKey="Ma A L Pez M" first="Manuel J." last="Ma A L Pez">Manuel J. Maña López</name>
<affiliation>
<mods:affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: manuel.mana@dti.uhu.es</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Vazquez, Jacinto Mata" sort="Vazquez, Jacinto Mata" uniqKey="Vazquez J" first="Jacinto Mata" last="Vázquez">Jacinto Mata Vázquez</name>
<affiliation>
<mods:affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: jacinto.mata@dti.uhu.es</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Alvarez, Victoria Pach N" sort="Alvarez, Victoria Pach N" uniqKey="Alvarez V" first="Victoria Pach N" last="Álvarez">Victoria Pachón Álvarez</name>
<affiliation>
<mods:affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: victoria.pachon@dti.uhu.es</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:97B352EA21591E79646B653C076A9252B3C37074</idno>
<date when="2012" year="2012">2012</date>
<idno type="doi">10.1002/asi.22679</idno>
<idno type="url">https://api.istex.fr/document/97B352EA21591E79646B653C076A9252B3C37074/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001605</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">A machine‐learning approach to negation and speculation detection in clinical texts</title>
<author>
<name sortKey="Cruz Diaz, Noa P" sort="Cruz Diaz, Noa P" uniqKey="Cruz Diaz N" first="Noa P." last="Cruz Díaz">Noa P. Cruz Díaz</name>
<affiliation>
<mods:affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: noa.cruz@dti.uhu.es</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Ma A L Pez, Manuel J" sort="Ma A L Pez, Manuel J" uniqKey="Ma A L Pez M" first="Manuel J." last="Ma A L Pez">Manuel J. Maña López</name>
<affiliation>
<mods:affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: manuel.mana@dti.uhu.es</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Vazquez, Jacinto Mata" sort="Vazquez, Jacinto Mata" uniqKey="Vazquez J" first="Jacinto Mata" last="Vázquez">Jacinto Mata Vázquez</name>
<affiliation>
<mods:affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: jacinto.mata@dti.uhu.es</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Alvarez, Victoria Pach N" sort="Alvarez, Victoria Pach N" uniqKey="Alvarez V" first="Victoria Pach N" last="Álvarez">Victoria Pachón Álvarez</name>
<affiliation>
<mods:affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: victoria.pachon@dti.uhu.es</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Journal of the American Society for Information Science and Technology</title>
<title level="j" type="abbrev">J Am Soc Inf Sci Tec</title>
<idno type="ISSN">1532-2882</idno>
<idno type="eISSN">1532-2890</idno>
<imprint>
<publisher>Blackwell Publishing Ltd</publisher>
<date type="published" when="2012-07">2012-07</date>
<biblScope unit="volume">63</biblScope>
<biblScope unit="issue">7</biblScope>
<biblScope unit="page" from="1398">1398</biblScope>
<biblScope unit="page" to="1410">1410</biblScope>
</imprint>
<idno type="ISSN">1532-2882</idno>
</series>
<idno type="istex">97B352EA21591E79646B653C076A9252B3C37074</idno>
<idno type="DOI">10.1002/asi.22679</idno>
<idno type="ArticleID">ASI22679</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">1532-2882</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract">Detecting negative and speculative information is essential in most biomedical text‐mining tasks where these language forms are used to express impressions, hypotheses, or explanations of experimental results. Our research is focused on developing a system based on machine‐learning techniques that identifies negation and speculation signals and their scope in clinical texts. The proposed system works in two consecutive phases: first, a classifier decides whether each token in a sentence is a negation/speculation signal or not. Then another classifier determines, at sentence level, the tokens which are affected by the signals previously identified. The system was trained and evaluated on the clinical texts of the BioScope corpus, a freely available resource consisting of medical and biological texts: full‐length articles, scientific abstracts, and clinical reports. The results obtained by our system were compared with those of two different systems, one based on regular expressions and the other based on machine learning. Our system's results outperformed the results obtained by these two systems. In the signal detection task, the F‐score value was 97.3% in negation and 94.9% in speculation. In the scope‐finding task, a token was correctly classified if it had been properly identified as being inside or outside the scope of all the negation signals present in the sentence. Our proposal showed an F score of 93.2% in negation and 80.9% in speculation. Additionally, the percentage of correct scopes (those with all their tokens correctly classified) was evaluated obtaining F scores of 90.9% in negation and 71.9% in speculation.</div>
</front>
</TEI>
<istex>
<corpusName>wiley</corpusName>
<author>
<json:item>
<name>Noa P. Cruz Díaz</name>
<affiliations>
<json:string>Department of Information Technology, University of Huelva, Huelva, Spain</json:string>
<json:string>E-mail: noa.cruz@dti.uhu.es</json:string>
</affiliations>
</json:item>
<json:item>
<name>Manuel J. Maña López</name>
<affiliations>
<json:string>Department of Information Technology, University of Huelva, Huelva, Spain</json:string>
<json:string>E-mail: manuel.mana@dti.uhu.es</json:string>
</affiliations>
</json:item>
<json:item>
<name>Jacinto Mata Vázquez</name>
<affiliations>
<json:string>Department of Information Technology, University of Huelva, Huelva, Spain</json:string>
<json:string>E-mail: jacinto.mata@dti.uhu.es</json:string>
</affiliations>
</json:item>
<json:item>
<name>Victoria Pachón Álvarez</name>
<affiliations>
<json:string>Department of Information Technology, University of Huelva, Huelva, Spain</json:string>
<json:string>E-mail: victoria.pachon@dti.uhu.es</json:string>
</affiliations>
</json:item>
</author>
<subject>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>machine learning</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>natural language processing</value>
</json:item>
<json:item>
<lang>
<json:string>eng</json:string>
</lang>
<value>biomedical information</value>
</json:item>
</subject>
<articleId>
<json:string>ASI22679</json:string>
</articleId>
<language>
<json:string>eng</json:string>
</language>
<originalGenre>
<json:string>article</json:string>
</originalGenre>
<abstract>Detecting negative and speculative information is essential in most biomedical text‐mining tasks where these language forms are used to express impressions, hypotheses, or explanations of experimental results. Our research is focused on developing a system based on machine‐learning techniques that identifies negation and speculation signals and their scope in clinical texts. The proposed system works in two consecutive phases: first, a classifier decides whether each token in a sentence is a negation/speculation signal or not. Then another classifier determines, at sentence level, the tokens which are affected by the signals previously identified. The system was trained and evaluated on the clinical texts of the BioScope corpus, a freely available resource consisting of medical and biological texts: full‐length articles, scientific abstracts, and clinical reports. The results obtained by our system were compared with those of two different systems, one based on regular expressions and the other based on machine learning. Our system's results outperformed the results obtained by these two systems. In the signal detection task, the F‐score value was 97.3% in negation and 94.9% in speculation. In the scope‐finding task, a token was correctly classified if it had been properly identified as being inside or outside the scope of all the negation signals present in the sentence. Our proposal showed an F score of 93.2% in negation and 80.9% in speculation. Additionally, the percentage of correct scopes (those with all their tokens correctly classified) was evaluated obtaining F scores of 90.9% in negation and 71.9% in speculation.</abstract>
<qualityIndicators>
<score>8.464</score>
<pdfVersion>1.4</pdfVersion>
<pdfPageSize>612 x 792 pts (letter)</pdfPageSize>
<refBibsNative>true</refBibsNative>
<keywordCount>3</keywordCount>
<abstractCharCount>1650</abstractCharCount>
<pdfWordCount>8283</pdfWordCount>
<pdfCharCount>50832</pdfCharCount>
<pdfPageCount>13</pdfPageCount>
<abstractWordCount>247</abstractWordCount>
</qualityIndicators>
<title>A machine‐learning approach to negation and speculation detection in clinical texts</title>
<genre>
<json:string>article</json:string>
</genre>
<host>
<volume>63</volume>
<publisherId>
<json:string>ASI</json:string>
</publisherId>
<pages>
<total>13</total>
<last>1410</last>
<first>1398</first>
</pages>
<issn>
<json:string>1532-2882</json:string>
</issn>
<issue>7</issue>
<subject>
<json:item>
<value>natural language processing</value>
</json:item>
<json:item>
<value>machine learning</value>
</json:item>
<json:item>
<value>biomedical information</value>
</json:item>
<json:item>
<value>semantic analysis</value>
</json:item>
<json:item>
<value>signal boundary detection</value>
</json:item>
<json:item>
<value>RESEARCH ARTICLE</value>
</json:item>
</subject>
<genre>
<json:string>journal</json:string>
</genre>
<language>
<json:string>unknown</json:string>
</language>
<eissn>
<json:string>1532-2890</json:string>
</eissn>
<title>Journal of the American Society for Information Science and Technology</title>
<doi>
<json:string>10.1002/(ISSN)1532-2890</json:string>
</doi>
</host>
<publicationDate>2012</publicationDate>
<copyrightDate>2012</copyrightDate>
<doi>
<json:string>10.1002/asi.22679</json:string>
</doi>
<id>97B352EA21591E79646B653C076A9252B3C37074</id>
<score>0.22330087</score>
<fulltext>
<json:item>
<original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/97B352EA21591E79646B653C076A9252B3C37074/fulltext/pdf</uri>
</json:item>
<json:item>
<original>false</original>
<mimetype>application/zip</mimetype>
<extension>zip</extension>
<uri>https://api.istex.fr/document/97B352EA21591E79646B653C076A9252B3C37074/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/97B352EA21591E79646B653C076A9252B3C37074/fulltext/tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main" xml:lang="en">A machine‐learning approach to negation and speculation detection in clinical texts</title>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher>Blackwell Publishing Ltd</publisher>
<availability>
<p>© 2012 ASIS&amp;T</p>
</availability>
<date>2012-03-19</date>
</publicationStmt>
<notesStmt>
<note>Spanish Ministry of Science and Innovation</note>
<note>Spanish Government Plan E</note>
<note>European Union through ERDF - No. TIN2009‐14057‐C03‐03</note>
</notesStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a" type="main" xml:lang="en">A machine‐learning approach to negation and speculation detection in clinical texts</title>
<author xml:id="author-1">
<persName>
<forename type="first">Noa P.</forename>
<surname>Cruz Díaz</surname>
</persName>
<email>noa.cruz@dti.uhu.es</email>
<affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</affiliation>
</author>
<author xml:id="author-2">
<persName>
<forename type="first">Manuel J.</forename>
<surname>Maña López</surname>
</persName>
<email>manuel.mana@dti.uhu.es</email>
<affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</affiliation>
</author>
<author xml:id="author-3">
<persName>
<forename type="first">Jacinto Mata</forename>
<surname>Vázquez</surname>
</persName>
<email>jacinto.mata@dti.uhu.es</email>
<affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</affiliation>
</author>
<author xml:id="author-4">
<persName>
<forename type="first">Victoria Pachón</forename>
<surname>Álvarez</surname>
</persName>
<email>victoria.pachon@dti.uhu.es</email>
<affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</affiliation>
</author>
</analytic>
<monogr>
<title level="j">Journal of the American Society for Information Science and Technology</title>
<title level="j" type="abbrev">J Am Soc Inf Sci Tec</title>
<idno type="pISSN">1532-2882</idno>
<idno type="eISSN">1532-2890</idno>
<idno type="DOI">10.1002/(ISSN)1532-2890</idno>
<imprint>
<publisher>Blackwell Publishing Ltd</publisher>
<date type="published" when="2012-07"></date>
<biblScope unit="volume">63</biblScope>
<biblScope unit="issue">7</biblScope>
<biblScope unit="page" from="1398">1398</biblScope>
<biblScope unit="page" to="1410">1410</biblScope>
</imprint>
</monogr>
<idno type="istex">97B352EA21591E79646B653C076A9252B3C37074</idno>
<idno type="DOI">10.1002/asi.22679</idno>
<idno type="ArticleID">ASI22679</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>2012-03-19</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract>
<p>Detecting negative and speculative information is essential in most biomedical text‐mining tasks where these language forms are used to express impressions, hypotheses, or explanations of experimental results. Our research is focused on developing a system based on machine‐learning techniques that identifies negation and speculation signals and their scope in clinical texts. The proposed system works in two consecutive phases: first, a classifier decides whether each token in a sentence is a negation/speculation signal or not. Then another classifier determines, at sentence level, the tokens which are affected by the signals previously identified. The system was trained and evaluated on the clinical texts of the BioScope corpus, a freely available resource consisting of medical and biological texts: full‐length articles, scientific abstracts, and clinical reports. The results obtained by our system were compared with those of two different systems, one based on regular expressions and the other based on machine learning. Our system's results outperformed the results obtained by these two systems. In the signal detection task, the F‐score value was 97.3% in negation and 94.9% in speculation. In the scope‐finding task, a token was correctly classified if it had been properly identified as being inside or outside the scope of all the negation signals present in the sentence. Our proposal showed an F score of 93.2% in negation and 80.9% in speculation. Additionally, the percentage of correct scopes (those with all their tokens correctly classified) was evaluated obtaining F scores of 90.9% in negation and 71.9% in speculation.</p>
</abstract>
<textClass>
<keywords scheme="keyword">
<list>
<head>keywords</head>
<item>
<term>machine learning</term>
</item>
<item>
<term>natural language processing</term>
</item>
<item>
<term>biomedical information</term>
</item>
</list>
</keywords>
</textClass>
<textClass>
<keywords scheme="Journal Subject">
<list>
<head>index-terms</head>
<item>
<term>natural language processing</term>
</item>
<item>
<term>machine learning</term>
</item>
<item>
<term>biomedical information</term>
</item>
<item>
<term>semantic analysis</term>
</item>
<item>
<term>signal boundary detection</term>
</item>
</list>
</keywords>
</textClass>
<textClass>
<keywords scheme="Journal Subject">
<list>
<head>article-category</head>
<item>
<term>RESEARCH ARTICLE</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="2011-03-02">Received</change>
<change when="2012-02-23">Registration</change>
<change when="2012-03-19">Created</change>
<change when="2012-07">Published</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<original>false</original>
<mimetype>text/plain</mimetype>
<extension>txt</extension>
<uri>https://api.istex.fr/document/97B352EA21591E79646B653C076A9252B3C37074/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="Wiley, elements deleted: body">
<istex:xmlDeclaration>version="1.0" encoding="UTF-8" standalone="yes"</istex:xmlDeclaration>
<istex:document>
<component type="serialArticle" version="2.0" xml:id="ASI22679" xml:lang="en">
<header>
<publicationMeta level="product">
<doi origin="wiley">10.1002/(ISSN)1532-2890</doi>
<issn type="print">1532-2882</issn>
<issn type="electronic">1532-2890</issn>
<idGroup>
<id type="product" value="ASI"></id>
</idGroup>
<titleGroup>
<title sort="JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY" type="main">Journal of the American Society for Information Science and Technology</title>
<title type="short">J Am Soc Inf Sci Tec</title>
</titleGroup>
</publicationMeta>
<publicationMeta level="part" position="07107">
<doi origin="wiley">10.1002/asi.v63.7</doi>
<copyright ownership="thirdParty">© 2012 ASIS&amp;T</copyright>
<numberingGroup>
<numbering number="63" type="journalVolume">63</numbering>
<numbering type="journalIssue">7</numbering>
</numberingGroup>
<coverDate startDate="2012-07">July 2012</coverDate>
</publicationMeta>
<publicationMeta level="unit" position="110" status="forIssue" type="article">
<doi>10.1002/asi.22679</doi>
<idGroup>
<id type="unit" value="ASI22679"></id>
</idGroup>
<countGroup>
<count number="13" type="pageTotal"></count>
</countGroup>
<titleGroup>
<title type="tocHeading1">RESEARCH ARTICLES</title>
<title type="articleCategory">RESEARCH ARTICLE</title>
</titleGroup>
<copyright ownership="thirdParty">© 2012 ASIS&amp;T</copyright>
<eventGroup>
<event agent="bestset" date="2012-03-19" type="xmlCreated"></event>
<event date="2011-03-02" type="manuscriptReceived"></event>
<event date="2012-02-23" type="manuscriptRevised"></event>
<event date="2012-02-23" type="manuscriptAccepted"></event>
<event type="publishedOnlineEarlyUnpaginated" date="2012-05-31"></event>
<event type="firstOnline" date="2012-05-31"></event>
<event type="publishedOnlineFinalForm" date="2012-06-14"></event>
<event type="xmlConverted" agent="Converter:WILEY_ML3G_TO_WILEY_ML3GV2 version:3.8.8" date="2014-01-06"></event>
<event type="xmlConverted" agent="Converter:WML3G_To_WML3G version:4.3.4 mode:FullText" date="2015-02-24"></event>
</eventGroup>
<numberingGroup>
<numbering type="pageFirst">1398</numbering>
<numbering type="pageLast">1410</numbering>
</numberingGroup>
<subjectInfo>
<subject href="http://psi.asis.org/digital/natural+language+processing">natural language processing</subject>
<subject href="http://psi.asis.org/digital/machine+learning">machine learning</subject>
<subject href="http://psi.asis.org/digital/biomedical+information">biomedical information</subject>
<subject href="http://psi.asis.org/digital/semantic+analysis">semantic analysis</subject>
<subject href="http://psi.asis.org/digital/signal+boundary+detection">signal boundary detection</subject>
</subjectInfo>
<linkGroup>
<link type="toTypesetVersion" href="file:ASI.ASI22679.pdf"></link>
</linkGroup>
</publicationMeta>
<contentMeta>
<titleGroup>
<title type="main">A machine‐learning approach to negation and speculation detection in clinical texts</title>
</titleGroup>
<creators>
<creator affiliationRef="#asi22679-aff-0001" creatorRole="author" xml:id="asi22679-cr-0001">
<personName>
<givenNames>Noa P.</givenNames>
<familyName>Cruz Díaz</familyName>
</personName>
<contactDetails>
<email>noa.cruz@dti.uhu.es</email>
</contactDetails>
</creator>
<creator affiliationRef="#asi22679-aff-0001" creatorRole="author" xml:id="asi22679-cr-0002">
<personName>
<givenNames>Manuel J.</givenNames>
<familyName>Maña López</familyName>
</personName>
<contactDetails>
<email>manuel.mana@dti.uhu.es</email>
</contactDetails>
</creator>
<creator affiliationRef="#asi22679-aff-0001" creatorRole="author" xml:id="asi22679-cr-0003">
<personName>
<givenNames>Jacinto Mata</givenNames>
<familyName>Vázquez</familyName>
</personName>
<contactDetails>
<email>jacinto.mata@dti.uhu.es</email>
</contactDetails>
</creator>
<creator affiliationRef="#asi22679-aff-0001" creatorRole="author" xml:id="asi22679-cr-0004">
<personName>
<givenNames>Victoria Pachón</givenNames>
<familyName>Álvarez</familyName>
</personName>
<contactDetails>
<email>victoria.pachon@dti.uhu.es</email>
</contactDetails>
</creator>
</creators>
<affiliationGroup>
<affiliation countryCode="ES" xml:id="asi22679-aff-0001">
<orgDiv>Department of Information Technology</orgDiv>
<orgName>University of Huelva</orgName>
<address>
<city>Huelva</city>
<country>Spain</country>
</address>
</affiliation>
</affiliationGroup>
<keywordGroup type="author">
<keyword xml:id="asi22679-kwd-0001">machine learning</keyword>
<keyword xml:id="asi22679-kwd-0002">natural language processing</keyword>
<keyword xml:id="asi22679-kwd-0003">biomedical information</keyword>
</keywordGroup>
<fundingInfo>
<fundingAgency>Spanish Ministry of Science and Innovation</fundingAgency>
</fundingInfo>
<fundingInfo>
<fundingAgency>Spanish Government Plan E</fundingAgency>
</fundingInfo>
<fundingInfo>
<fundingAgency>European Union through ERDF</fundingAgency>
<fundingNumber>TIN2009‐14057‐C03‐03</fundingNumber>
</fundingInfo>
<abstractGroup>
<abstract type="main">
<p>Detecting negative and speculative information is essential in most biomedical text‐mining tasks where these language forms are used to express impressions, hypotheses, or explanations of experimental results. Our research is focused on developing a system based on machine‐learning techniques that identifies negation and speculation signals and their scope in clinical texts. The proposed system works in two consecutive phases: first, a classifier decides whether each token in a sentence is a negation/speculation signal or not. Then another classifier determines, at sentence level, the tokens which are affected by the signals previously identified. The system was trained and evaluated on the clinical texts of the <fc>B</fc>io<fc>S</fc>cope corpus, a freely available resource consisting of medical and biological texts: full‐length articles, scientific abstracts, and clinical reports. The results obtained by our system were compared with those of two different systems, one based on regular expressions and the other based on machine learning. Our system's results outperformed the results obtained by these two systems. In the signal detection task, the <i><fc>F</fc></i>‐score value was 97.3% in negation and 94.9% in speculation. In the scope‐finding task, a token was correctly classified if it had been properly identified as being inside or outside the scope of all the negation signals present in the sentence. Our proposal showed an <i><fc>F</fc></i> score of 93.2% in negation and 80.9% in speculation. Additionally, the percentage of correct scopes (those with all their tokens correctly classified) was evaluated obtaining <i><fc>F</fc></i> scores of 90.9% in negation and 71.9% in speculation.</p>
</abstract>
</abstractGroup>
</contentMeta>
</header>
</component>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo lang="en">
<title>A machine‐learning approach to negation and speculation detection in clinical texts</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA" lang="en">
<title>A machine‐learning approach to negation and speculation detection in clinical texts</title>
</titleInfo>
<name type="personal">
<namePart type="given">Noa P.</namePart>
<namePart type="family">Cruz Díaz</namePart>
<affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</affiliation>
<affiliation>E-mail: noa.cruz@dti.uhu.es</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Manuel J.</namePart>
<namePart type="family">Maña López</namePart>
<affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</affiliation>
<affiliation>E-mail: manuel.mana@dti.uhu.es</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jacinto Mata</namePart>
<namePart type="family">Vázquez</namePart>
<affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</affiliation>
<affiliation>E-mail: jacinto.mata@dti.uhu.es</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Victoria Pachón</namePart>
<namePart type="family">Álvarez</namePart>
<affiliation>Department of Information Technology, University of Huelva, Huelva, Spain</affiliation>
<affiliation>E-mail: victoria.pachon@dti.uhu.es</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="article" displayLabel="article"></genre>
<originInfo>
<publisher>Blackwell Publishing Ltd</publisher>
<dateIssued encoding="w3cdtf">2012-07</dateIssued>
<dateCreated encoding="w3cdtf">2012-03-19</dateCreated>
<dateCaptured encoding="w3cdtf">2011-03-02</dateCaptured>
<dateValid encoding="w3cdtf">2012-02-23</dateValid>
<copyrightDate encoding="w3cdtf">2012</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
</language>
<physicalDescription>
<internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract>Detecting negative and speculative information is essential in most biomedical text‐mining tasks where these language forms are used to express impressions, hypotheses, or explanations of experimental results. Our research is focused on developing a system based on machine‐learning techniques that identifies negation and speculation signals and their scope in clinical texts. The proposed system works in two consecutive phases: first, a classifier decides whether each token in a sentence is a negation/speculation signal or not. Then another classifier determines, at sentence level, the tokens which are affected by the signals previously identified. The system was trained and evaluated on the clinical texts of the BioScope corpus, a freely available resource consisting of medical and biological texts: full‐length articles, scientific abstracts, and clinical reports. The results obtained by our system were compared with those of two different systems, one based on regular expressions and the other based on machine learning. Our system's results outperformed the results obtained by these two systems. In the signal detection task, the F‐score value was 97.3% in negation and 94.9% in speculation. In the scope‐finding task, a token was correctly classified if it had been properly identified as being inside or outside the scope of all the negation signals present in the sentence. Our proposal showed an F score of 93.2% in negation and 80.9% in speculation. Additionally, the percentage of correct scopes (those with all their tokens correctly classified) was evaluated obtaining F scores of 90.9% in negation and 71.9% in speculation.</abstract>
<note type="funding">Spanish Ministry of Science and Innovation</note>
<note type="funding">Spanish Government Plan E</note>
<note type="funding">European Union through ERDF - No. TIN2009‐14057‐C03‐03</note>
<subject>
<genre>keywords</genre>
<topic>machine learning</topic>
<topic>natural language processing</topic>
<topic>biomedical information</topic>
</subject>
<relatedItem type="host">
<titleInfo>
<title>Journal of the American Society for Information Science and Technology</title>
</titleInfo>
<titleInfo type="abbreviated">
<title>J Am Soc Inf Sci Tec</title>
</titleInfo>
<genre type="journal">journal</genre>
<subject>
<genre>index-terms</genre>
<topic authorityURI="http://psi.asis.org/digital/natural+language+processing">natural language processing</topic>
<topic authorityURI="http://psi.asis.org/digital/machine+learning">machine learning</topic>
<topic authorityURI="http://psi.asis.org/digital/biomedical+information">biomedical information</topic>
<topic authorityURI="http://psi.asis.org/digital/semantic+analysis">semantic analysis</topic>
<topic authorityURI="http://psi.asis.org/digital/signal+boundary+detection">signal boundary detection</topic>
</subject>
<subject>
<genre>article-category</genre>
<topic>RESEARCH ARTICLE</topic>
</subject>
<identifier type="ISSN">1532-2882</identifier>
<identifier type="eISSN">1532-2890</identifier>
<identifier type="DOI">10.1002/(ISSN)1532-2890</identifier>
<identifier type="PublisherID">ASI</identifier>
<part>
<date>2012</date>
<detail type="volume">
<caption>vol.</caption>
<number>63</number>
</detail>
<detail type="issue">
<caption>no.</caption>
<number>7</number>
</detail>
<extent unit="pages">
<start>1398</start>
<end>1410</end>
<total>13</total>
</extent>
</part>
</relatedItem>
<identifier type="istex">97B352EA21591E79646B653C076A9252B3C37074</identifier>
<identifier type="DOI">10.1002/asi.22679</identifier>
<identifier type="ArticleID">ASI22679</identifier>
<accessCondition type="use and reproduction" contentType="copyright">© 2012 ASIS&amp;T</accessCondition>
<recordInfo>
<recordContentSource>WILEY</recordContentSource>
</recordInfo>
</mods>
</metadata>
<enrichments>
<json:item>
<type>multicat</type>
<uri>https://api.istex.fr/document/97B352EA21591E79646B653C076A9252B3C37074/enrichments/multicat</uri>
</json:item>
</enrichments>
<serie></serie>
</istex>
</record>
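
As a minimal sketch of how the bibliographic fields in the record above might be read programmatically, the following parses a fragment copied from its `<series>` element with Python's standard `xml.etree.ElementTree`. The function name `summarize_series` is hypothetical. Only a prefix-free fragment is parsed, because the record as displayed uses prefixes such as `mods:`, `wicri:`, `json:`, and `istex:` without visible namespace declarations, which a namespace-aware parser would reject.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the <series> element of the record above.
SERIES_XML = """
<series>
  <title level="j">Journal of the American Society for Information Science and Technology</title>
  <title level="j" type="abbrev">J Am Soc Inf Sci Tec</title>
  <idno type="ISSN">1532-2882</idno>
  <idno type="eISSN">1532-2890</idno>
  <imprint>
    <publisher>Blackwell Publishing Ltd</publisher>
    <date type="published" when="2012-07">2012-07</date>
    <biblScope unit="volume">63</biblScope>
    <biblScope unit="issue">7</biblScope>
    <biblScope unit="page" from="1398">1398</biblScope>
    <biblScope unit="page" to="1410">1410</biblScope>
  </imprint>
</series>
"""


def summarize_series(xml_text: str) -> dict:
    """Collect journal name, ISSN, volume, issue and page range (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    # The full journal title is the <title> without a type attribute.
    journal = next(t.text for t in root.findall("title") if t.get("type") is None)
    scopes = root.findall("imprint/biblScope")
    # Volume and issue carry their value as element text.
    by_unit = {s.get("unit"): s.text for s in scopes if s.get("unit") != "page"}
    # The page range is split across two elements with from/to attributes.
    first_page = next(s.get("from") for s in scopes if s.get("from"))
    last_page = next(s.get("to") for s in scopes if s.get("to"))
    return {
        "journal": journal,
        "issn": root.findtext("idno[@type='ISSN']"),
        "volume": by_unit["volume"],
        "issue": by_unit["issue"],
        "pages": f"{first_page}-{last_page}",
    }


info = summarize_series(SERIES_XML)
print(info)
```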

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Belgique/explor/OpenAccessBelV2/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001605 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 001605 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Belgique
   |area=    OpenAccessBelV2
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:97B352EA21591E79646B653C076A9252B3C37074
   |texte=   A machine‐learning approach to negation and speculation detection in clinical texts
}}
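
For pages that need many such links, the template text can be generated rather than typed by hand. The sketch below is only an illustration: the helper `explor_link` and its Python parameter names are hypothetical, the field names are copied from the template above, and nothing is assumed about how the wiki interprets them beyond what the page shows.

```python
def explor_link(wiki: str, area: str, flux: str, etape: str,
                type_: str, cle: str, texte: str) -> str:
    """Render an {{Explor lien}} block in the layout used on this page."""
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", etape),
        ("type", type_),
        ("clé", cle),
        ("texte", texte),
    ]
    lines = ["{{Explor lien"]
    for name, value in fields:
        # Pad "   |name=" to 13 characters so the values line up in one column.
        lines.append(f"   |{name}=".ljust(13) + value)
    lines.append("}}")
    return "\n".join(lines)


link = explor_link(
    wiki="Wicri/Belgique",
    area="OpenAccessBelV2",
    flux="Istex",
    etape="Corpus",
    type_="RBID",
    cle="ISTEX:97B352EA21591E79646B653C076A9252B3C37074",
    texte="A machine-learning approach to negation and speculation detection in clinical texts",
)
print(link)
```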

Wicri

This area was generated with Dilib version V0.6.25.
Data generation: Thu Dec 1 00:43:49 2016. Site generation: Wed Mar 6 14:51:30 2024