OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Impact of Term-Indexing for Arabic Document Retrieval

Internal identifier: 002933 ( Istex/Corpus ); previous: 002932; next: 002934

Author: Siham Boulaknadel

Source:

RBID : ISTEX:A73D5FA654634F250F567B70C330A496FCB968A6

Abstract

In this paper, we adapt the standard method for multi-word term extraction for Arabic language. We define the linguistic specifications and develop a term extraction tool. We experiment the term extraction program for document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions considering either the corpus or the document, and demonstrate the efficiency of multi-word term indexing for both weighting up to 5.8% of average precision.

Url:
DOI: 10.1007/978-3-540-69858-6_49

Links to Exploration step

ISTEX:A73D5FA654634F250F567B70C330A496FCB968A6

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Impact of Term-Indexing for Arabic Document Retrieval</title>
<author>
<name sortKey="Boulaknadel, Siham" sort="Boulaknadel, Siham" uniqKey="Boulaknadel S" first="Siham" last="Boulaknadel">Siham Boulaknadel</name>
<affiliation>
<mods:affiliation>LINA FRE CNRS 2729 Université de Nantes, 2 rue la Houssinière, BP 92208, 44322, Nantes cedex 03, France</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>GSCM Université Mohammed V, BP 1014, Agdal Rabat-Maroc</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: siham.boulaknadel@univ-nantes.fr</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:A73D5FA654634F250F567B70C330A496FCB968A6</idno>
<date when="2008" year="2008">2008</date>
<idno type="doi">10.1007/978-3-540-69858-6_49</idno>
<idno type="url">https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">002933</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Impact of Term-Indexing for Arabic Document Retrieval</title>
<author>
<name sortKey="Boulaknadel, Siham" sort="Boulaknadel, Siham" uniqKey="Boulaknadel S" first="Siham" last="Boulaknadel">Siham Boulaknadel</name>
<affiliation>
<mods:affiliation>LINA FRE CNRS 2729 Université de Nantes, 2 rue la Houssinière, BP 92208, 44322, Nantes cedex 03, France</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>GSCM Université Mohammed V, BP 1014, Agdal Rabat-Maroc</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: siham.boulaknadel@univ-nantes.fr</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2008</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">A73D5FA654634F250F567B70C330A496FCB968A6</idno>
<idno type="DOI">10.1007/978-3-540-69858-6_49</idno>
<idno type="ChapterID">49</idno>
<idno type="ChapterID">Chap49</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: In this paper, we adapt the standard method for multi-word term extraction for Arabic language. We define the linguistic specifications and develop a term extraction tool. We experiment the term extraction program for document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions considering either the corpus or the document, and demonstrate the efficiency of multi-word term indexing for both weighting up to 5.8% of average precision.</div>
</front>
</TEI>
<istex>
<corpusName>springer</corpusName>
<author>
<json:item>
<name>Siham Boulaknadel</name>
<affiliations>
<json:string>LINA FRE CNRS 2729 Université de Nantes, 2 rue la Houssinière, BP 92208, 44322, Nantes cedex 03, France</json:string>
<json:string>GSCM Université Mohammed V, BP 1014, Agdal Rabat-Maroc</json:string>
<json:string>E-mail: siham.boulaknadel@univ-nantes.fr</json:string>
</affiliations>
</json:item>
</author>
<language>
<json:string>eng</json:string>
</language>
<abstract>Abstract: In this paper, we adapt the standard method for multi-word term extraction for Arabic language. We define the linguistic specifications and develop a term extraction tool. We experiment the term extraction program for document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions considering either the corpus or the document, and demonstrate the efficiency of multi-word term indexing for both weighting up to 5.8% of average precision.</abstract>
<qualityIndicators>
<score>3.468</score>
<pdfVersion>1.6</pdfVersion>
<pdfPageSize>430 x 660 pts</pdfPageSize>
<refBibsNative>false</refBibsNative>
<keywordCount>0</keywordCount>
<abstractCharCount>484</abstractCharCount>
<pdfWordCount>1104</pdfWordCount>
<pdfCharCount>7052</pdfCharCount>
<pdfPageCount>4</pdfPageCount>
<abstractWordCount>72</abstractWordCount>
</qualityIndicators>
<title>Impact of Term-Indexing for Arabic Document Retrieval</title>
<genre.original>
<json:string>OriginalPaper</json:string>
</genre.original>
<chapterId>
<json:string>49</json:string>
<json:string>Chap49</json:string>
</chapterId>
<genre>
<json:string>conference [eBooks]</json:string>
</genre>
<serie>
<editor>
<json:item>
<name>David Hutchison</name>
</json:item>
<json:item>
<name>Takeo Kanade</name>
</json:item>
<json:item>
<name>Josef Kittler</name>
</json:item>
<json:item>
<name>Jon M. Kleinberg</name>
</json:item>
<json:item>
<name>Friedemann Mattern</name>
</json:item>
<json:item>
<name>John C. Mitchell</name>
</json:item>
<json:item>
<name>Moni Naor</name>
</json:item>
<json:item>
<name>Oscar Nierstrasz</name>
</json:item>
<json:item>
<name>C. Pandu Rangan</name>
</json:item>
<json:item>
<name>Bernhard Steffen</name>
</json:item>
<json:item>
<name>Madhu Sudan</name>
</json:item>
<json:item>
<name>Demetri Terzopoulos</name>
</json:item>
<json:item>
<name>Doug Tygar</name>
</json:item>
<json:item>
<name>Moshe Y. Vardi</name>
</json:item>
<json:item>
<name>Gerhard Weikum</name>
</json:item>
</editor>
<issn>
<json:string>0302-9743</json:string>
</issn>
<language>
<json:string>unknown</json:string>
</language>
<eissn>
<json:string>1611-3349</json:string>
</eissn>
<title>Lecture Notes in Computer Science</title>
<copyrightDate>2008</copyrightDate>
</serie>
<host>
<editor>
<json:item>
<name>Epaminondas Kapetanios</name>
</json:item>
<json:item>
<name>Vijayan Sugumaran</name>
</json:item>
<json:item>
<name>Myra Spiliopoulou</name>
</json:item>
</editor>
<subject>
<json:item>
<value>Computer Science</value>
</json:item>
<json:item>
<value>Computer Science</value>
</json:item>
<json:item>
<value>Database Management</value>
</json:item>
<json:item>
<value>Computer Communication Networks</value>
</json:item>
<json:item>
<value>Data Mining and Knowledge Discovery</value>
</json:item>
<json:item>
<value>Information Storage and Retrieval</value>
</json:item>
<json:item>
<value>Mathematical Logic and Formal Languages</value>
</json:item>
<json:item>
<value>Artificial Intelligence (incl. Robotics)</value>
</json:item>
</subject>
<isbn>
<json:string>978-3-540-69857-9</json:string>
</isbn>
<language>
<json:string>unknown</json:string>
</language>
<eissn>
<json:string>1611-3349</json:string>
</eissn>
<title>Natural Language and Information Systems</title>
<genre.original>
<json:string>Proceedings</json:string>
</genre.original>
<bookId>
<json:string>978-3-540-69858-6</json:string>
</bookId>
<volume>5039</volume>
<pages>
<last>383</last>
<first>380</first>
</pages>
<issn>
<json:string>0302-9743</json:string>
</issn>
<genre>
<json:string>Book Series</json:string>
</genre>
<eisbn>
<json:string>978-3-540-69858-6</json:string>
</eisbn>
<copyrightDate>2008</copyrightDate>
<doi>
<json:string>10.1007/978-3-540-69858-6</json:string>
</doi>
</host>
<publicationDate>2008</publicationDate>
<copyrightDate>2008</copyrightDate>
<doi>
<json:string>10.1007/978-3-540-69858-6_49</json:string>
</doi>
<id>A73D5FA654634F250F567B70C330A496FCB968A6</id>
<fulltext>
<json:item>
<original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/fulltext/pdf</uri>
</json:item>
<json:item>
<original>false</original>
<mimetype>application/zip</mimetype>
<extension>zip</extension>
<uri>https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/fulltext/tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main" xml:lang="en">Impact of Term-Indexing for Arabic Document Retrieval</title>
<respStmt xml:id="ISTEX-API" resp="Références bibliographiques récupérées via GROBID" name="ISTEX-API (INIST-CNRS)"></respStmt>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher>Springer Berlin Heidelberg</publisher>
<pubPlace>Berlin, Heidelberg</pubPlace>
<availability>
<p>SPRINGER</p>
</availability>
<date>2008</date>
</publicationStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a" type="main" xml:lang="en">Impact of Term-Indexing for Arabic Document Retrieval</title>
<author>
<persName>
<forename type="first">Siham</forename>
<surname>Boulaknadel</surname>
</persName>
<email>siham.boulaknadel@univ-nantes.fr</email>
<affiliation>LINA FRE CNRS 2729 Université de Nantes, 2 rue la Houssinière, BP 92208, 44322, Nantes cedex 03, France</affiliation>
<affiliation>GSCM Université Mohammed V, BP 1014, Agdal Rabat-Maroc</affiliation>
</author>
</analytic>
<monogr>
<title level="m">Natural Language and Information Systems</title>
<title level="m" type="sub">13th International Conference on Applications of Natural Language to Information Systems, NLDB 2008 London, UK, June 24-27, 2008 Proceedings</title>
<idno type="pISBN">978-3-540-69857-9</idno>
<idno type="eISBN">978-3-540-69858-6</idno>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="DOI">10.1007/978-3-540-69858-6</idno>
<idno type="BookID">978-3-540-69858-6</idno>
<idno type="BookTitleID">165040</idno>
<idno type="BookSequenceNumber">5039</idno>
<idno type="BookVolumeNumber">5039</idno>
<idno type="BookChapterCount">49</idno>
<editor>
<persName>
<forename type="first">Epaminondas</forename>
<surname>Kapetanios</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Vijayan</forename>
<surname>Sugumaran</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Myra</forename>
<surname>Spiliopoulou</surname>
</persName>
</editor>
<imprint>
<publisher>Springer Berlin Heidelberg</publisher>
<pubPlace>Berlin, Heidelberg</pubPlace>
<date type="published" when="2008"></date>
<biblScope unit="volume">5039</biblScope>
<biblScope unit="page" from="380">380</biblScope>
<biblScope unit="page" to="383">383</biblScope>
</imprint>
</monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<editor>
<persName>
<forename type="first">David</forename>
<surname>Hutchison</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Takeo</forename>
<surname>Kanade</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Josef</forename>
<surname>Kittler</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Jon</forename>
<forename type="first">M.</forename>
<surname>Kleinberg</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Friedemann</forename>
<surname>Mattern</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">John</forename>
<forename type="first">C.</forename>
<surname>Mitchell</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Moni</forename>
<surname>Naor</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Oscar</forename>
<surname>Nierstrasz</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">C.</forename>
<surname>Pandu Rangan</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Bernhard</forename>
<surname>Steffen</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Madhu</forename>
<surname>Sudan</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Demetri</forename>
<surname>Terzopoulos</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Doug</forename>
<surname>Tygar</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Moshe</forename>
<forename type="first">Y.</forename>
<surname>Vardi</surname>
</persName>
</editor>
<editor>
<persName>
<forename type="first">Gerhard</forename>
<surname>Weikum</surname>
</persName>
</editor>
<biblScope>
<date>2008</date>
</biblScope>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="seriesId">558</idno>
</series>
<idno type="istex">A73D5FA654634F250F567B70C330A496FCB968A6</idno>
<idno type="DOI">10.1007/978-3-540-69858-6_49</idno>
<idno type="ChapterID">49</idno>
<idno type="ChapterID">Chap49</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>2008</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract xml:lang="en">
<p>Abstract: In this paper, we adapt the standard method for multi-word term extraction for Arabic language. We define the linguistic specifications and develop a term extraction tool. We experiment the term extraction program for document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions considering either the corpus or the document, and demonstrate the efficiency of multi-word term indexing for both weighting up to 5.8% of average precision.</p>
</abstract>
<textClass>
<keywords scheme="Book Subject Collection">
<list>
<label>SUCO11645</label>
<item>
<term>Computer Science</term>
</item>
</list>
</keywords>
</textClass>
<textClass>
<keywords scheme="Book Subject Group">
<list>
<label>I</label>
<label>I18024</label>
<label>I13022</label>
<label>I18030</label>
<label>I18032</label>
<label>I16048</label>
<label>I21017</label>
<item>
<term>Computer Science</term>
</item>
<item>
<term>Database Management</term>
</item>
<item>
<term>Computer Communication Networks</term>
</item>
<item>
<term>Data Mining and Knowledge Discovery</term>
</item>
<item>
<term>Information Storage and Retrieval</term>
</item>
<item>
<term>Mathematical Logic and Formal Languages</term>
</item>
<item>
<term>Artificial Intelligence (incl. Robotics)</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="2008">Published</change>
<change xml:id="refBibs-istex" who="#ISTEX-API" when="2016-3-19">References added</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<original>false</original>
<mimetype>text/plain</mimetype>
<extension>txt</extension>
<uri>https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="Springer, Publisher found" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//Springer-Verlag//DTD A++ V2.4//EN" URI="http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd" name="istex:docType"></istex:docType>
<istex:document>
<Publisher>
<PublisherInfo>
<PublisherName>Springer Berlin Heidelberg</PublisherName>
<PublisherLocation>Berlin, Heidelberg</PublisherLocation>
</PublisherInfo>
<Series>
<SeriesInfo SeriesType="Series" TocLevels="0">
<SeriesID>558</SeriesID>
<SeriesPrintISSN>0302-9743</SeriesPrintISSN>
<SeriesElectronicISSN>1611-3349</SeriesElectronicISSN>
<SeriesTitle Language="En">Lecture Notes in Computer Science</SeriesTitle>
</SeriesInfo>
<SeriesHeader>
<EditorGroup>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>David</GivenName>
<FamilyName>Hutchison</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Takeo</GivenName>
<FamilyName>Kanade</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Josef</GivenName>
<FamilyName>Kittler</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Jon</GivenName>
<GivenName>M.</GivenName>
<FamilyName>Kleinberg</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Friedemann</GivenName>
<FamilyName>Mattern</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>John</GivenName>
<GivenName>C.</GivenName>
<FamilyName>Mitchell</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Moni</GivenName>
<FamilyName>Naor</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Oscar</GivenName>
<FamilyName>Nierstrasz</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>C.</GivenName>
<FamilyName>Pandu Rangan</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Bernhard</GivenName>
<FamilyName>Steffen</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Madhu</GivenName>
<FamilyName>Sudan</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Demetri</GivenName>
<FamilyName>Terzopoulos</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Doug</GivenName>
<FamilyName>Tygar</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Moshe</GivenName>
<GivenName>Y.</GivenName>
<FamilyName>Vardi</FamilyName>
</EditorName>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Gerhard</GivenName>
<FamilyName>Weikum</FamilyName>
</EditorName>
</Editor>
</EditorGroup>
</SeriesHeader>
<Book Language="En">
<BookInfo BookProductType="Proceedings" ContainsESM="No" Language="En" MediaType="eBook" NumberingStyle="Unnumbered" OutputMedium="All" TocLevels="0">
<BookID>978-3-540-69858-6</BookID>
<BookTitle>Natural Language and Information Systems</BookTitle>
<BookSubTitle>13th International Conference on Applications of Natural Language to Information Systems, NLDB 2008 London, UK, June 24-27, 2008 Proceedings</BookSubTitle>
<BookVolumeNumber>5039</BookVolumeNumber>
<BookSequenceNumber>5039</BookSequenceNumber>
<BookDOI>10.1007/978-3-540-69858-6</BookDOI>
<BookTitleID>165040</BookTitleID>
<BookPrintISBN>978-3-540-69857-9</BookPrintISBN>
<BookElectronicISBN>978-3-540-69858-6</BookElectronicISBN>
<BookChapterCount>49</BookChapterCount>
<BookCopyright>
<CopyrightHolderName>Springer-Verlag Berlin Heidelberg</CopyrightHolderName>
<CopyrightYear>2008</CopyrightYear>
</BookCopyright>
<BookSubjectGroup>
<BookSubject Code="I" Type="Primary">Computer Science</BookSubject>
<BookSubject Code="I18024" Priority="1" Type="Secondary">Database Management</BookSubject>
<BookSubject Code="I13022" Priority="2" Type="Secondary">Computer Communication Networks</BookSubject>
<BookSubject Code="I18030" Priority="3" Type="Secondary">Data Mining and Knowledge Discovery</BookSubject>
<BookSubject Code="I18032" Priority="4" Type="Secondary">Information Storage and Retrieval</BookSubject>
<BookSubject Code="I16048" Priority="5" Type="Secondary">Mathematical Logic and Formal Languages</BookSubject>
<BookSubject Code="I21017" Priority="6" Type="Secondary">Artificial Intelligence (incl. Robotics)</BookSubject>
<SubjectCollection Code="SUCO11645">Computer Science</SubjectCollection>
</BookSubjectGroup>
</BookInfo>
<BookHeader>
<EditorGroup>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Epaminondas</GivenName>
<FamilyName>Kapetanios</FamilyName>
</EditorName>
<Contact>
<Email>e.kapetanios@wmin.ac.uk</Email>
</Contact>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Vijayan</GivenName>
<FamilyName>Sugumaran</FamilyName>
</EditorName>
<Contact>
<Email>sugumara@oakland.edu</Email>
</Contact>
</Editor>
<Editor>
<EditorName DisplayOrder="Western">
<GivenName>Myra</GivenName>
<FamilyName>Spiliopoulou</FamilyName>
</EditorName>
<Contact>
<Email>myra@iti.cs.uni-magdeburg.de</Email>
</Contact>
</Editor>
</EditorGroup>
</BookHeader>
<Part ID="Part13">
<PartInfo TocLevels="0">
<PartID>13</PartID>
<PartSequenceNumber>13</PartSequenceNumber>
<PartTitle>Doctoral Symposium Papers</PartTitle>
<PartChapterCount>4</PartChapterCount>
<PartContext>
<SeriesID>558</SeriesID>
<BookTitle>Natural Language and Information Systems</BookTitle>
</PartContext>
</PartInfo>
<Chapter ID="Chap49" Language="En">
<ChapterInfo ChapterType="OriginalPaper" ContainsESM="No" NumberingStyle="Unnumbered" TocLevels="0">
<ChapterID>49</ChapterID>
<ChapterDOI>10.1007/978-3-540-69858-6_49</ChapterDOI>
<ChapterSequenceNumber>49</ChapterSequenceNumber>
<ChapterTitle Language="En">Impact of Term-Indexing for Arabic Document Retrieval</ChapterTitle>
<ChapterFirstPage>380</ChapterFirstPage>
<ChapterLastPage>383</ChapterLastPage>
<ChapterCopyright>
<CopyrightHolderName>Springer-Verlag Berlin Heidelberg</CopyrightHolderName>
<CopyrightYear>2008</CopyrightYear>
</ChapterCopyright>
<ChapterGrants Type="Regular">
<MetadataGrant Grant="OpenAccess"></MetadataGrant>
<AbstractGrant Grant="OpenAccess"></AbstractGrant>
<BodyPDFGrant Grant="Restricted"></BodyPDFGrant>
<BodyHTMLGrant Grant="Restricted"></BodyHTMLGrant>
<BibliographyGrant Grant="Restricted"></BibliographyGrant>
<ESMGrant Grant="Restricted"></ESMGrant>
</ChapterGrants>
<ChapterContext>
<SeriesID>558</SeriesID>
<PartID>13</PartID>
<BookID>978-3-540-69858-6</BookID>
<BookTitle>Natural Language and Information Systems</BookTitle>
</ChapterContext>
</ChapterInfo>
<ChapterHeader>
<AuthorGroup>
<Author AffiliationIDS="Aff1 Aff2">
<AuthorName DisplayOrder="Western">
<GivenName>Siham</GivenName>
<FamilyName>Boulaknadel</FamilyName>
</AuthorName>
<Contact>
<Email>siham.boulaknadel@univ-nantes.fr</Email>
</Contact>
</Author>
<Affiliation ID="Aff1">
<OrgName>LINA FRE CNRS 2729 Université de Nantes</OrgName>
<OrgAddress>
<Street>2 rue la Houssinière</Street>
<Postbox>BP 92208</Postbox>
<Postcode>44322</Postcode>
<City>Nantes cedex 03</City>
<Country>France</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff2">
<OrgName>GSCM Université Mohammed V</OrgName>
<OrgAddress>
<Postbox>BP 1014</Postbox>
<City>Agdal Rabat-Maroc</City>
</OrgAddress>
</Affiliation>
</AuthorGroup>
<Abstract ID="Abs1" Language="En">
<Heading>Abstract</Heading>
<Para>In this paper, we adapt the standard method for multi-word term extraction for Arabic language. We define the linguistic specifications and develop a term extraction tool. We experiment the term extraction program for document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions considering either the corpus or the document, and demonstrate the efficiency of multi-word term indexing for both weighting up to 5.8% of average precision.</Para>
</Abstract>
</ChapterHeader>
<NoBody></NoBody>
</Chapter>
</Part>
</Book>
</Series>
</Publisher>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo lang="en">
<title>Impact of Term-Indexing for Arabic Document Retrieval</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA" lang="en">
<title>Impact of Term-Indexing for Arabic Document Retrieval</title>
</titleInfo>
<name type="personal">
<namePart type="given">Siham</namePart>
<namePart type="family">Boulaknadel</namePart>
<affiliation>LINA FRE CNRS 2729 Université de Nantes, 2 rue la Houssinière, BP 92208, 44322, Nantes cedex 03, France</affiliation>
<affiliation>GSCM Université Mohammed V, BP 1014, Agdal Rabat-Maroc</affiliation>
<affiliation>E-mail: siham.boulaknadel@univ-nantes.fr</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="conference [eBooks]" displayLabel="OriginalPaper"></genre>
<originInfo>
<publisher>Springer Berlin Heidelberg</publisher>
<place>
<placeTerm type="text">Berlin, Heidelberg</placeTerm>
</place>
<dateIssued encoding="w3cdtf">2008</dateIssued>
<copyrightDate encoding="w3cdtf">2008</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
</language>
<physicalDescription>
<internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">Abstract: In this paper, we adapt the standard method for multi-word term extraction for Arabic language. We define the linguistic specifications and develop a term extraction tool. We experiment the term extraction program for document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions considering either the corpus or the document, and demonstrate the efficiency of multi-word term indexing for both weighting up to 5.8% of average precision.</abstract>
<relatedItem type="host">
<titleInfo>
<title>Natural Language and Information Systems</title>
<subTitle>13th International Conference on Applications of Natural Language to Information Systems, NLDB 2008 London, UK, June 24-27, 2008 Proceedings</subTitle>
</titleInfo>
<name type="personal">
<namePart type="given">Epaminondas</namePart>
<namePart type="family">Kapetanios</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Vijayan</namePart>
<namePart type="family">Sugumaran</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Myra</namePart>
<namePart type="family">Spiliopoulou</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<genre type="Book Series" displayLabel="Proceedings"></genre>
<originInfo>
<copyrightDate encoding="w3cdtf">2008</copyrightDate>
<issuance>monographic</issuance>
</originInfo>
<subject>
<genre>Book Subject Collection</genre>
<topic authority="SpringerSubjectCodes" authorityURI="SUCO11645">Computer Science</topic>
</subject>
<subject>
<genre>Book Subject Group</genre>
<topic authority="SpringerSubjectCodes" authorityURI="I">Computer Science</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I18024">Database Management</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I13022">Computer Communication Networks</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I18030">Data Mining and Knowledge Discovery</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I18032">Information Storage and Retrieval</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I16048">Mathematical Logic and Formal Languages</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I21017">Artificial Intelligence (incl. Robotics)</topic>
</subject>
<identifier type="DOI">10.1007/978-3-540-69858-6</identifier>
<identifier type="ISBN">978-3-540-69857-9</identifier>
<identifier type="eISBN">978-3-540-69858-6</identifier>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="BookTitleID">165040</identifier>
<identifier type="BookID">978-3-540-69858-6</identifier>
<identifier type="BookChapterCount">49</identifier>
<identifier type="BookVolumeNumber">5039</identifier>
<identifier type="BookSequenceNumber">5039</identifier>
<identifier type="PartChapterCount">4</identifier>
<part>
<date>2008</date>
<detail type="part">
<title>Doctoral Symposium Papers</title>
</detail>
<detail type="volume">
<number>5039</number>
<caption>vol.</caption>
</detail>
<extent unit="pages">
<start>380</start>
<end>383</end>
</extent>
</part>
<recordInfo>
<recordOrigin>Springer-Verlag Berlin Heidelberg, 2008</recordOrigin>
</recordInfo>
</relatedItem>
<relatedItem type="series">
<titleInfo>
<title>Lecture Notes in Computer Science</title>
</titleInfo>
<name type="personal">
<namePart type="given">David</namePart>
<namePart type="family">Hutchison</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Takeo</namePart>
<namePart type="family">Kanade</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Josef</namePart>
<namePart type="family">Kittler</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jon</namePart>
<namePart type="given">M.</namePart>
<namePart type="family">Kleinberg</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Friedemann</namePart>
<namePart type="family">Mattern</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">John</namePart>
<namePart type="given">C.</namePart>
<namePart type="family">Mitchell</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Moni</namePart>
<namePart type="family">Naor</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Oscar</namePart>
<namePart type="family">Nierstrasz</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">C.</namePart>
<namePart type="family">Pandu Rangan</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Bernhard</namePart>
<namePart type="family">Steffen</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Madhu</namePart>
<namePart type="family">Sudan</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Demetri</namePart>
<namePart type="family">Terzopoulos</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Doug</namePart>
<namePart type="family">Tygar</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Moshe</namePart>
<namePart type="given">Y.</namePart>
<namePart type="family">Vardi</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gerhard</namePart>
<namePart type="family">Weikum</namePart>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<copyrightDate encoding="w3cdtf">2008</copyrightDate>
<issuance>serial</issuance>
</originInfo>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="SeriesID">558</identifier>
<recordInfo>
<recordOrigin>Springer-Verlag Berlin Heidelberg, 2008</recordOrigin>
</recordInfo>
</relatedItem>
<identifier type="istex">A73D5FA654634F250F567B70C330A496FCB968A6</identifier>
<identifier type="DOI">10.1007/978-3-540-69858-6_49</identifier>
<identifier type="ChapterID">49</identifier>
<identifier type="ChapterID">Chap49</identifier>
<accessCondition type="use and reproduction" contentType="copyright">Springer-Verlag Berlin Heidelberg, 2008</accessCondition>
<recordInfo>
<recordContentSource>SPRINGER</recordContentSource>
<recordOrigin>Springer-Verlag Berlin Heidelberg, 2008</recordOrigin>
</recordInfo>
</mods>
</metadata>
<enrichments>
<istex:refBibTEI uri="https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/enrichments/refBib">
<teiHeader></teiHeader>
<text>
<front></front>
<body></body>
<back>
<listBibl>
<biblStruct xml:id="b0">
<analytic>
<title level="a" type="main">Automatic term detection: A review of current systems</title>
<author>
<persName>
<forename type="first">M</forename>
<forename type="middle">T</forename>
<surname>Cabré</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">R</forename>
<forename type="middle">E</forename>
<surname>Bagot</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">J</forename>
<forename type="middle">V</forename>
<surname>Palatresi</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Recent Advances in Computational Terminology. Natural Language Processing</title>
<editor>Bourigault, D., Jacquemin, C., L'Homme, M.C.</editor>
<meeting>
<address>
<addrLine>Amsterdam</addrLine>
</address>
</meeting>
<imprint>
<date type="published" when="2001"></date>
<biblScope unit="page" from="53" to="88"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b1">
<analytic>
<title level="a" type="main">Automatic tagging of Arabic text: From raw text to base phrase chunks</title>
<author>
<persName>
<forename type="first">M</forename>
<surname>Diab</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">K</forename>
<surname>Hacioglu</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">D</forename>
<surname>Jurafsky</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of HLT-NAACL</title>
<meeting>HLT-NAACL
<address>
<addrLine>Boston</addrLine>
</address>
</meeting>
<imprint>
<date type="published" when="2004"></date>
<biblScope unit="page" from="149" to="152"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b2">
<analytic>
<title level="a" type="main">Accurate methods for the statistics of surprise and coincidence</title>
<author>
<persName>
<forename type="first">T</forename>
<surname>Dunning</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Computational Linguistics</title>
<imprint>
<biblScope unit="volume">19</biblScope>
<biblScope unit="issue">1</biblScope>
<biblScope unit="page" from="61" to="74"></biblScope>
<date type="published" when="1993"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b3">
<analytic>
<title level="a" type="main">Word association norms, mutual information, and lexicography</title>
<author>
<persName>
<forename type="first">K</forename>
<forename type="middle">W</forename>
<surname>Church</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">P</forename>
<surname>Hanks</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics</title>
<meeting>the 27th Annual Meeting of the Association for Computational Linguistics
<address>
<addrLine>Vancouver, B.C</addrLine>
</address>
</meeting>
<imprint>
<date type="published" when="1989"></date>
<biblScope unit="page" from="76" to="83"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b4">
<analytic>
<title level="a" type="main">Using statistics in lexical analysis</title>
<author>
<persName>
<forename type="first">K</forename>
<surname>Church</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">W</forename>
<surname>Gale</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">P</forename>
<surname>Hanks</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">D</forename>
<surname>Hindle</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Lexical Acquisition: Exploiting On-Line Resources to Build a Lexicon</title>
<imprint>
<date type="published" when="1991"></date>
<biblScope unit="page" from="115" to="164"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b5">
<analytic>
<title level="a" type="main">Methods of automatic term recognition: a review</title>
<author>
<persName>
<forename type="first">K</forename>
<surname>Kageura</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">B</forename>
<surname>Umino</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Terminology</title>
<imprint>
<biblScope unit="volume">3</biblScope>
<biblScope unit="issue">2</biblScope>
<biblScope unit="page" from="259" to="289"></biblScope>
<date type="published" when="1996"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b6">
<monogr>
<title level="m" type="main">Probabilistic Methods for Searching OCR-Degraded Arabic Text</title>
<author>
<persName>
<forename type="first">K</forename>
<surname>Darwish</surname>
</persName>
</author>
<imprint>
<date type="published" when="2003"></date>
<pubPlace>Maryland, USA</pubPlace>
</imprint>
</monogr>
</biblStruct>
</listBibl>
</back>
</text>
</istex:refBibTEI>
</enrichments>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002933 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 002933 | SxmlIndent | more
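
Either pipeline above prints the indented XML record. As a minimal sketch, assuming that output has been captured as a string, the Python below lists the cited references found in the record's `<listBibl>` block. The function name `extract_references` is hypothetical and not part of Dilib. The sketch isolates `<listBibl>` with a regular expression before parsing, on the assumption that the full record may carry prefixes (`mods:`, `istex:`, `wicri:`) whose namespaces are not declared.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI that ElementTree uses for the predefined "xml:" prefix.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"


def extract_references(record_xml):
    """Return a list of (id, surnames, title, year) tuples, one per <biblStruct>.

    Hypothetical helper: only the <listBibl> block is parsed, because the
    rest of the record may use undeclared namespace prefixes.
    """
    match = re.search(r"<listBibl>.*?</listBibl>", record_xml, re.DOTALL)
    if match is None:
        return []
    root = ET.fromstring(match.group(0))
    refs = []
    for bibl in root.iter("biblStruct"):
        ref_id = bibl.get(XML_NS + "id")
        # First <title> in document order: the article title when an
        # <analytic> block is present, otherwise the monograph title.
        title = (bibl.findtext(".//title") or "").strip()
        surnames = [s.text.strip() for s in bibl.iter("surname") if s.text]
        date = bibl.find(".//date")
        year = date.get("when") if date is not None else None
        refs.append((ref_id, surnames, title, year))
    return refs
```

One way to use it would be to redirect the output of either command to a file, read that file into a string, and pass the string to `extract_references`.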

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:A73D5FA654634F250F567B70C330A496FCB968A6
   |texte=   Impact of Term-Indexing for Arabic Document Retrieval
}}


This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024