Impact of Term-Indexing for Arabic Document Retrieval
Internal identifier: 002933 (Istex/Corpus); previous: 002932; next: 002934
Authors: Siham Boulaknadel
Source:
- Lecture Notes in Computer Science [0302-9743]; 2008.
Abstract
In this paper, we adapt the standard method for multi-word term extraction to the Arabic language. We define the linguistic specifications and develop a term extraction tool. We apply the term extraction program to document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions that consider either the corpus or the document, and show that multi-word term indexing is effective under both weightings, with gains of up to 5.8% in average precision.
DOI: 10.1007/978-3-540-69858-6_49
Links to Exploration step
ISTEX:A73D5FA654634F250F567B70C330A496FCB968A6
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Impact of Term-Indexing for Arabic Document Retrieval</title>
<author><name sortKey="Boulaknadel, Siham" sort="Boulaknadel, Siham" uniqKey="Boulaknadel S" first="Siham" last="Boulaknadel">Siham Boulaknadel</name>
<affiliation><mods:affiliation>LINA FRE CNRS 2729 Université de Nantes, 2 rue la Houssinière, BP 92208, 44322, Nantes cedex 03, France</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>GSCM Université Mohammed V, BP 1014, Agdal Rabat-Maroc</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>E-mail: siham.boulaknadel@univ-nantes.fr</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:A73D5FA654634F250F567B70C330A496FCB968A6</idno>
<date when="2008" year="2008">2008</date>
<idno type="doi">10.1007/978-3-540-69858-6_49</idno>
<idno type="url">https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">002933</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Impact of Term-Indexing for Arabic Document Retrieval</title>
<author><name sortKey="Boulaknadel, Siham" sort="Boulaknadel, Siham" uniqKey="Boulaknadel S" first="Siham" last="Boulaknadel">Siham Boulaknadel</name>
<affiliation><mods:affiliation>LINA FRE CNRS 2729 Université de Nantes, 2 rue la Houssinière, BP 92208, 44322, Nantes cedex 03, France</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>GSCM Université Mohammed V, BP 1014, Agdal Rabat-Maroc</mods:affiliation>
</affiliation>
<affiliation><mods:affiliation>E-mail: siham.boulaknadel@univ-nantes.fr</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2008</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">A73D5FA654634F250F567B70C330A496FCB968A6</idno>
<idno type="DOI">10.1007/978-3-540-69858-6_49</idno>
<idno type="ChapterID">49</idno>
<idno type="ChapterID">Chap49</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: In this paper, we adapt the standard method for multi-word term extraction for Arabic language. We define the linguistic specifications and develop a term extraction tool. We experiment the term extraction program for document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions considering either the corpus or the document, and demonstrate the efficiency of multi-word term indexing for both weighting up to 5.8% of average precision.</div>
</front>
</TEI>
<istex><corpusName>springer</corpusName>
<author><json:item><name>Siham Boulaknadel</name>
<affiliations><json:string>LINA FRE CNRS 2729 Université de Nantes, 2 rue la Houssinière, BP 92208, 44322, Nantes cedex 03, France</json:string>
<json:string>GSCM Université Mohammed V, BP 1014, Agdal Rabat-Maroc</json:string>
<json:string>E-mail: siham.boulaknadel@univ-nantes.fr</json:string>
</affiliations>
</json:item>
</author>
<language><json:string>eng</json:string>
</language>
<abstract>Abstract: In this paper, we adapt the standard method for multi-word term extraction for Arabic language. We define the linguistic specifications and develop a term extraction tool. We experiment the term extraction program for document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions considering either the corpus or the document, and demonstrate the efficiency of multi-word term indexing for both weighting up to 5.8% of average precision.</abstract>
<qualityIndicators><score>3.468</score>
<pdfVersion>1.6</pdfVersion>
<pdfPageSize>430 x 660 pts</pdfPageSize>
<refBibsNative>false</refBibsNative>
<keywordCount>0</keywordCount>
<abstractCharCount>484</abstractCharCount>
<pdfWordCount>1104</pdfWordCount>
<pdfCharCount>7052</pdfCharCount>
<pdfPageCount>4</pdfPageCount>
<abstractWordCount>72</abstractWordCount>
</qualityIndicators>
<title>Impact of Term-Indexing for Arabic Document Retrieval</title>
<genre.original><json:string>OriginalPaper</json:string>
</genre.original>
<chapterId><json:string>49</json:string>
<json:string>Chap49</json:string>
</chapterId>
<genre><json:string>conference [eBooks]</json:string>
</genre>
<serie><editor><json:item><name>David Hutchison</name>
</json:item>
<json:item><name>Takeo Kanade</name>
</json:item>
<json:item><name>Josef Kittler</name>
</json:item>
<json:item><name>Jon M. Kleinberg</name>
</json:item>
<json:item><name>Friedemann Mattern</name>
</json:item>
<json:item><name>John C. Mitchell</name>
</json:item>
<json:item><name>Moni Naor</name>
</json:item>
<json:item><name>Oscar Nierstrasz</name>
</json:item>
<json:item><name>C. Pandu Rangan</name>
</json:item>
<json:item><name>Bernhard Steffen</name>
</json:item>
<json:item><name>Madhu Sudan</name>
</json:item>
<json:item><name>Demetri Terzopoulos</name>
</json:item>
<json:item><name>Doug Tygar</name>
</json:item>
<json:item><name>Moshe Y. Vardi</name>
</json:item>
<json:item><name>Gerhard Weikum</name>
</json:item>
</editor>
<issn><json:string>0302-9743</json:string>
</issn>
<language><json:string>unknown</json:string>
</language>
<eissn><json:string>1611-3349</json:string>
</eissn>
<title>Lecture Notes in Computer Science</title>
<copyrightDate>2008</copyrightDate>
</serie>
<host><editor><json:item><name>Epaminondas Kapetanios</name>
</json:item>
<json:item><name>Vijayan Sugumaran</name>
</json:item>
<json:item><name>Myra Spiliopoulou</name>
</json:item>
</editor>
<subject><json:item><value>Computer Science</value>
</json:item>
<json:item><value>Computer Science</value>
</json:item>
<json:item><value>Database Management</value>
</json:item>
<json:item><value>Computer Communication Networks</value>
</json:item>
<json:item><value>Data Mining and Knowledge Discovery</value>
</json:item>
<json:item><value>Information Storage and Retrieval</value>
</json:item>
<json:item><value>Mathematical Logic and Formal Languages</value>
</json:item>
<json:item><value>Artificial Intelligence (incl. Robotics)</value>
</json:item>
</subject>
<isbn><json:string>978-3-540-69857-9</json:string>
</isbn>
<language><json:string>unknown</json:string>
</language>
<eissn><json:string>1611-3349</json:string>
</eissn>
<title>Natural Language and Information Systems</title>
<genre.original><json:string>Proceedings</json:string>
</genre.original>
<bookId><json:string>978-3-540-69858-6</json:string>
</bookId>
<volume>5039</volume>
<pages><last>383</last>
<first>380</first>
</pages>
<issn><json:string>0302-9743</json:string>
</issn>
<genre><json:string>Book Series</json:string>
</genre>
<eisbn><json:string>978-3-540-69858-6</json:string>
</eisbn>
<copyrightDate>2008</copyrightDate>
<doi><json:string>10.1007/978-3-540-69858-6</json:string>
</doi>
</host>
<publicationDate>2008</publicationDate>
<copyrightDate>2008</copyrightDate>
<doi><json:string>10.1007/978-3-540-69858-6_49</json:string>
</doi>
<id>A73D5FA654634F250F567B70C330A496FCB968A6</id>
<fulltext><json:item><original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/fulltext/pdf</uri>
</json:item>
<json:item><original>false</original>
<mimetype>application/zip</mimetype>
<extension>zip</extension>
<uri>https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/fulltext/tei"><teiHeader><fileDesc><titleStmt><title level="a" type="main" xml:lang="en">Impact of Term-Indexing for Arabic Document Retrieval</title>
<respStmt xml:id="ISTEX-API" resp="Références bibliographiques récupérées via GROBID" name="ISTEX-API (INIST-CNRS)"></respStmt>
</titleStmt>
<publicationStmt><authority>ISTEX</authority>
<publisher>Springer Berlin Heidelberg</publisher>
<pubPlace>Berlin, Heidelberg</pubPlace>
<availability><p>SPRINGER</p>
</availability>
<date>2008</date>
</publicationStmt>
<sourceDesc><biblStruct type="inbook"><analytic><title level="a" type="main" xml:lang="en">Impact of Term-Indexing for Arabic Document Retrieval</title>
<author><persName><forename type="first">Siham</forename>
<surname>Boulaknadel</surname>
</persName>
<email>siham.boulaknadel@univ-nantes.fr</email>
<affiliation>LINA FRE CNRS 2729 Université de Nantes, 2 rue la Houssinière, BP 92208, 44322, Nantes cedex 03, France</affiliation>
<affiliation>GSCM Université Mohammed V, BP 1014, Agdal Rabat-Maroc</affiliation>
</author>
</analytic>
<monogr><title level="m">Natural Language and Information Systems</title>
<title level="m" type="sub">13th International Conference on Applications of Natural Language to Information Systems, NLDB 2008 London, UK, June 24-27, 2008 Proceedings</title>
<idno type="pISBN">978-3-540-69857-9</idno>
<idno type="eISBN">978-3-540-69858-6</idno>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="DOI">10.1007/978-3-540-69858-6</idno>
<idno type="BookID">978-3-540-69858-6</idno>
<idno type="BookTitleID">165040</idno>
<idno type="BookSequenceNumber">5039</idno>
<idno type="BookVolumeNumber">5039</idno>
<idno type="BookChapterCount">49</idno>
<editor><persName><forename type="first">Epaminondas</forename>
<surname>Kapetanios</surname>
</persName>
</editor>
<editor><persName><forename type="first">Vijayan</forename>
<surname>Sugumaran</surname>
</persName>
</editor>
<editor><persName><forename type="first">Myra</forename>
<surname>Spiliopoulou</surname>
</persName>
</editor>
<imprint><publisher>Springer Berlin Heidelberg</publisher>
<pubPlace>Berlin, Heidelberg</pubPlace>
<date type="published" when="2008"></date>
<biblScope unit="volume">5039</biblScope>
<biblScope unit="page" from="380">380</biblScope>
<biblScope unit="page" to="383">383</biblScope>
</imprint>
</monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<editor><persName><forename type="first">David</forename>
<surname>Hutchison</surname>
</persName>
</editor>
<editor><persName><forename type="first">Takeo</forename>
<surname>Kanade</surname>
</persName>
</editor>
<editor><persName><forename type="first">Josef</forename>
<surname>Kittler</surname>
</persName>
</editor>
<editor><persName><forename type="first">Jon</forename>
<forename type="first">M.</forename>
<surname>Kleinberg</surname>
</persName>
</editor>
<editor><persName><forename type="first">Friedemann</forename>
<surname>Mattern</surname>
</persName>
</editor>
<editor><persName><forename type="first">John</forename>
<forename type="first">C.</forename>
<surname>Mitchell</surname>
</persName>
</editor>
<editor><persName><forename type="first">Moni</forename>
<surname>Naor</surname>
</persName>
</editor>
<editor><persName><forename type="first">Oscar</forename>
<surname>Nierstrasz</surname>
</persName>
</editor>
<editor><persName><forename type="first">C.</forename>
<surname>Pandu Rangan</surname>
</persName>
</editor>
<editor><persName><forename type="first">Bernhard</forename>
<surname>Steffen</surname>
</persName>
</editor>
<editor><persName><forename type="first">Madhu</forename>
<surname>Sudan</surname>
</persName>
</editor>
<editor><persName><forename type="first">Demetri</forename>
<surname>Terzopoulos</surname>
</persName>
</editor>
<editor><persName><forename type="first">Doug</forename>
<surname>Tygar</surname>
</persName>
</editor>
<editor><persName><forename type="first">Moshe</forename>
<forename type="first">Y.</forename>
<surname>Vardi</surname>
</persName>
</editor>
<editor><persName><forename type="first">Gerhard</forename>
<surname>Weikum</surname>
</persName>
</editor>
<biblScope><date>2008</date>
</biblScope>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="seriesId">558</idno>
</series>
<idno type="istex">A73D5FA654634F250F567B70C330A496FCB968A6</idno>
<idno type="DOI">10.1007/978-3-540-69858-6_49</idno>
<idno type="ChapterID">49</idno>
<idno type="ChapterID">Chap49</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><creation><date>2008</date>
</creation>
<langUsage><language ident="en">en</language>
</langUsage>
<abstract xml:lang="en"><p>Abstract: In this paper, we adapt the standard method for multi-word term extraction for Arabic language. We define the linguistic specifications and develop a term extraction tool. We experiment the term extraction program for document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions considering either the corpus or the document, and demonstrate the efficiency of multi-word term indexing for both weighting up to 5.8% of average precision.</p>
</abstract>
<textClass><keywords scheme="Book Subject Collection"><list><label>SUCO11645</label>
<item><term>Computer Science</term>
</item>
</list>
</keywords>
</textClass>
<textClass><keywords scheme="Book Subject Group"><list><label>I</label>
<label>I18024</label>
<label>I13022</label>
<label>I18030</label>
<label>I18032</label>
<label>I16048</label>
<label>I21017</label>
<item><term>Computer Science</term>
</item>
<item><term>Database Management</term>
</item>
<item><term>Computer Communication Networks</term>
</item>
<item><term>Data Mining and Knowledge Discovery</term>
</item>
<item><term>Information Storage and Retrieval</term>
</item>
<item><term>Mathematical Logic and Formal Languages</term>
</item>
<item><term>Artificial Intelligence (incl. Robotics)</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc><change when="2008">Published</change>
<change xml:id="refBibs-istex" who="#ISTEX-API" when="2016-3-19">References added</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item><original>false</original>
<mimetype>text/plain</mimetype>
<extension>txt</extension>
<uri>https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata><istex:metadataXml wicri:clean="Springer, Publisher found" wicri:toSee="no header"><istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//Springer-Verlag//DTD A++ V2.4//EN" URI="http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd" name="istex:docType"></istex:docType>
<istex:document><Publisher><PublisherInfo><PublisherName>Springer Berlin Heidelberg</PublisherName>
<PublisherLocation>Berlin, Heidelberg</PublisherLocation>
</PublisherInfo>
<Series><SeriesInfo SeriesType="Series" TocLevels="0"><SeriesID>558</SeriesID>
<SeriesPrintISSN>0302-9743</SeriesPrintISSN>
<SeriesElectronicISSN>1611-3349</SeriesElectronicISSN>
<SeriesTitle Language="En">Lecture Notes in Computer Science</SeriesTitle>
</SeriesInfo>
<SeriesHeader><EditorGroup><Editor><EditorName DisplayOrder="Western"><GivenName>David</GivenName>
<FamilyName>Hutchison</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Takeo</GivenName>
<FamilyName>Kanade</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Josef</GivenName>
<FamilyName>Kittler</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Jon</GivenName>
<GivenName>M.</GivenName>
<FamilyName>Kleinberg</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Friedemann</GivenName>
<FamilyName>Mattern</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>John</GivenName>
<GivenName>C.</GivenName>
<FamilyName>Mitchell</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Moni</GivenName>
<FamilyName>Naor</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Oscar</GivenName>
<FamilyName>Nierstrasz</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>C.</GivenName>
<FamilyName>Pandu Rangan</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Bernhard</GivenName>
<FamilyName>Steffen</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Madhu</GivenName>
<FamilyName>Sudan</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Demetri</GivenName>
<FamilyName>Terzopoulos</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Doug</GivenName>
<FamilyName>Tygar</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Moshe</GivenName>
<GivenName>Y.</GivenName>
<FamilyName>Vardi</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Gerhard</GivenName>
<FamilyName>Weikum</FamilyName>
</EditorName>
</Editor>
</EditorGroup>
</SeriesHeader>
<Book Language="En"><BookInfo BookProductType="Proceedings" ContainsESM="No" Language="En" MediaType="eBook" NumberingStyle="Unnumbered" OutputMedium="All" TocLevels="0"><BookID>978-3-540-69858-6</BookID>
<BookTitle>Natural Language and Information Systems</BookTitle>
<BookSubTitle>13th International Conference on Applications of Natural Language to Information Systems, NLDB 2008 London, UK, June 24-27, 2008 Proceedings</BookSubTitle>
<BookVolumeNumber>5039</BookVolumeNumber>
<BookSequenceNumber>5039</BookSequenceNumber>
<BookDOI>10.1007/978-3-540-69858-6</BookDOI>
<BookTitleID>165040</BookTitleID>
<BookPrintISBN>978-3-540-69857-9</BookPrintISBN>
<BookElectronicISBN>978-3-540-69858-6</BookElectronicISBN>
<BookChapterCount>49</BookChapterCount>
<BookCopyright><CopyrightHolderName>Springer-Verlag Berlin Heidelberg</CopyrightHolderName>
<CopyrightYear>2008</CopyrightYear>
</BookCopyright>
<BookSubjectGroup><BookSubject Code="I" Type="Primary">Computer Science</BookSubject>
<BookSubject Code="I18024" Priority="1" Type="Secondary">Database Management</BookSubject>
<BookSubject Code="I13022" Priority="2" Type="Secondary">Computer Communication Networks</BookSubject>
<BookSubject Code="I18030" Priority="3" Type="Secondary">Data Mining and Knowledge Discovery</BookSubject>
<BookSubject Code="I18032" Priority="4" Type="Secondary">Information Storage and Retrieval</BookSubject>
<BookSubject Code="I16048" Priority="5" Type="Secondary">Mathematical Logic and Formal Languages</BookSubject>
<BookSubject Code="I21017" Priority="6" Type="Secondary">Artificial Intelligence (incl. Robotics)</BookSubject>
<SubjectCollection Code="SUCO11645">Computer Science</SubjectCollection>
</BookSubjectGroup>
</BookInfo>
<BookHeader><EditorGroup><Editor><EditorName DisplayOrder="Western"><GivenName>Epaminondas</GivenName>
<FamilyName>Kapetanios</FamilyName>
</EditorName>
<Contact><Email>e.kapetanios@wmin.ac.uk</Email>
</Contact>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Vijayan</GivenName>
<FamilyName>Sugumaran</FamilyName>
</EditorName>
<Contact><Email>sugumara@oakland.edu</Email>
</Contact>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Myra</GivenName>
<FamilyName>Spiliopoulou</FamilyName>
</EditorName>
<Contact><Email>myra@iti.cs.uni-magdeburg.de</Email>
</Contact>
</Editor>
</EditorGroup>
</BookHeader>
<Part ID="Part13"><PartInfo TocLevels="0"><PartID>13</PartID>
<PartSequenceNumber>13</PartSequenceNumber>
<PartTitle>Doctoral Symposium Papers</PartTitle>
<PartChapterCount>4</PartChapterCount>
<PartContext><SeriesID>558</SeriesID>
<BookTitle>Natural Language and Information Systems</BookTitle>
</PartContext>
</PartInfo>
<Chapter ID="Chap49" Language="En"><ChapterInfo ChapterType="OriginalPaper" ContainsESM="No" NumberingStyle="Unnumbered" TocLevels="0"><ChapterID>49</ChapterID>
<ChapterDOI>10.1007/978-3-540-69858-6_49</ChapterDOI>
<ChapterSequenceNumber>49</ChapterSequenceNumber>
<ChapterTitle Language="En">Impact of Term-Indexing for Arabic Document Retrieval</ChapterTitle>
<ChapterFirstPage>380</ChapterFirstPage>
<ChapterLastPage>383</ChapterLastPage>
<ChapterCopyright><CopyrightHolderName>Springer-Verlag Berlin Heidelberg</CopyrightHolderName>
<CopyrightYear>2008</CopyrightYear>
</ChapterCopyright>
<ChapterGrants Type="Regular"><MetadataGrant Grant="OpenAccess"></MetadataGrant>
<AbstractGrant Grant="OpenAccess"></AbstractGrant>
<BodyPDFGrant Grant="Restricted"></BodyPDFGrant>
<BodyHTMLGrant Grant="Restricted"></BodyHTMLGrant>
<BibliographyGrant Grant="Restricted"></BibliographyGrant>
<ESMGrant Grant="Restricted"></ESMGrant>
</ChapterGrants>
<ChapterContext><SeriesID>558</SeriesID>
<PartID>13</PartID>
<BookID>978-3-540-69858-6</BookID>
<BookTitle>Natural Language and Information Systems</BookTitle>
</ChapterContext>
</ChapterInfo>
<ChapterHeader><AuthorGroup><Author AffiliationIDS="Aff1 Aff2"><AuthorName DisplayOrder="Western"><GivenName>Siham</GivenName>
<FamilyName>Boulaknadel</FamilyName>
</AuthorName>
<Contact><Email>siham.boulaknadel@univ-nantes.fr</Email>
</Contact>
</Author>
<Affiliation ID="Aff1"><OrgName>LINA FRE CNRS 2729 Université de Nantes</OrgName>
<OrgAddress><Street>2 rue la Houssinière</Street>
<Postbox>BP 92208</Postbox>
<Postcode>44322</Postcode>
<City>Nantes cedex 03</City>
<Country>France</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff2"><OrgName>GSCM Université Mohammed V</OrgName>
<OrgAddress><Postbox>BP 1014</Postbox>
<City>Agdal Rabat-Maroc</City>
</OrgAddress>
</Affiliation>
</AuthorGroup>
<Abstract ID="Abs1" Language="En"><Heading>Abstract</Heading>
<Para>In this paper, we adapt the standard method for multi-word term extraction for Arabic language. We define the linguistic specifications and develop a term extraction tool. We experiment the term extraction program for document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions considering either the corpus or the document, and demonstrate the efficiency of multi-word term indexing for both weighting up to 5.8% of average precision.</Para>
</Abstract>
</ChapterHeader>
<NoBody></NoBody>
</Chapter>
</Part>
</Book>
</Series>
</Publisher>
</istex:document>
</istex:metadataXml>
<mods version="3.6"><titleInfo lang="en"><title>Impact of Term-Indexing for Arabic Document Retrieval</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA" lang="en"><title>Impact of Term-Indexing for Arabic Document Retrieval</title>
</titleInfo>
<name type="personal"><namePart type="given">Siham</namePart>
<namePart type="family">Boulaknadel</namePart>
<affiliation>LINA FRE CNRS 2729 Université de Nantes, 2 rue la Houssinière, BP 92208, 44322, Nantes cedex 03, France</affiliation>
<affiliation>GSCM Université Mohammed V, BP 1014, Agdal Rabat-Maroc</affiliation>
<affiliation>E-mail: siham.boulaknadel@univ-nantes.fr</affiliation>
<role><roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="conference [eBooks]" displayLabel="OriginalPaper"></genre>
<originInfo><publisher>Springer Berlin Heidelberg</publisher>
<place><placeTerm type="text">Berlin, Heidelberg</placeTerm>
</place>
<dateIssued encoding="w3cdtf">2008</dateIssued>
<copyrightDate encoding="w3cdtf">2008</copyrightDate>
</originInfo>
<language><languageTerm type="code" authority="rfc3066">en</languageTerm>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
</language>
<physicalDescription><internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">Abstract: In this paper, we adapt the standard method for multi-word term extraction for Arabic language. We define the linguistic specifications and develop a term extraction tool. We experiment the term extraction program for document retrieval in a specific domain, evaluate two kinds of multi-word term weighting functions considering either the corpus or the document, and demonstrate the efficiency of multi-word term indexing for both weighting up to 5.8% of average precision.</abstract>
<relatedItem type="host"><titleInfo><title>Natural Language and Information Systems</title>
<subTitle>13th International Conference on Applications of Natural Language to Information Systems, NLDB 2008 London, UK, June 24-27, 2008 Proceedings</subTitle>
</titleInfo>
<name type="personal"><namePart type="given">Epaminondas</namePart>
<namePart type="family">Kapetanios</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Vijayan</namePart>
<namePart type="family">Sugumaran</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Myra</namePart>
<namePart type="family">Spiliopoulou</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<genre type="Book Series" displayLabel="Proceedings"></genre>
<originInfo><copyrightDate encoding="w3cdtf">2008</copyrightDate>
<issuance>monographic</issuance>
</originInfo>
<subject><genre>Book Subject Collection</genre>
<topic authority="SpringerSubjectCodes" authorityURI="SUCO11645">Computer Science</topic>
</subject>
<subject><genre>Book Subject Group</genre>
<topic authority="SpringerSubjectCodes" authorityURI="I">Computer Science</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I18024">Database Management</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I13022">Computer Communication Networks</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I18030">Data Mining and Knowledge Discovery</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I18032">Information Storage and Retrieval</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I16048">Mathematical Logic and Formal Languages</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I21017">Artificial Intelligence (incl. Robotics)</topic>
</subject>
<identifier type="DOI">10.1007/978-3-540-69858-6</identifier>
<identifier type="ISBN">978-3-540-69857-9</identifier>
<identifier type="eISBN">978-3-540-69858-6</identifier>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="BookTitleID">165040</identifier>
<identifier type="BookID">978-3-540-69858-6</identifier>
<identifier type="BookChapterCount">49</identifier>
<identifier type="BookVolumeNumber">5039</identifier>
<identifier type="BookSequenceNumber">5039</identifier>
<identifier type="PartChapterCount">4</identifier>
<part><date>2008</date>
<detail type="part"><title>Doctoral Symposium Papers</title>
</detail>
<detail type="volume"><number>5039</number>
<caption>vol.</caption>
</detail>
<extent unit="pages"><start>380</start>
<end>383</end>
</extent>
</part>
<recordInfo><recordOrigin>Springer-Verlag Berlin Heidelberg, 2008</recordOrigin>
</recordInfo>
</relatedItem>
<relatedItem type="series"><titleInfo><title>Lecture Notes in Computer Science</title>
</titleInfo>
<name type="personal"><namePart type="given">David</namePart>
<namePart type="family">Hutchison</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Takeo</namePart>
<namePart type="family">Kanade</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Josef</namePart>
<namePart type="family">Kittler</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Jon</namePart>
<namePart type="given">M.</namePart>
<namePart type="family">Kleinberg</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Friedemann</namePart>
<namePart type="family">Mattern</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">John</namePart>
<namePart type="given">C.</namePart>
<namePart type="family">Mitchell</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Moni</namePart>
<namePart type="family">Naor</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Oscar</namePart>
<namePart type="family">Nierstrasz</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">C.</namePart>
<namePart type="family">Pandu Rangan</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Bernhard</namePart>
<namePart type="family">Steffen</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Madhu</namePart>
<namePart type="family">Sudan</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Demetri</namePart>
<namePart type="family">Terzopoulos</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Doug</namePart>
<namePart type="family">Tygar</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Moshe</namePart>
<namePart type="given">Y.</namePart>
<namePart type="family">Vardi</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Gerhard</namePart>
<namePart type="family">Weikum</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<originInfo><copyrightDate encoding="w3cdtf">2008</copyrightDate>
<issuance>serial</issuance>
</originInfo>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="SeriesID">558</identifier>
<recordInfo><recordOrigin>Springer-Verlag Berlin Heidelberg, 2008</recordOrigin>
</recordInfo>
</relatedItem>
<identifier type="istex">A73D5FA654634F250F567B70C330A496FCB968A6</identifier>
<identifier type="DOI">10.1007/978-3-540-69858-6_49</identifier>
<identifier type="ChapterID">49</identifier>
<identifier type="ChapterID">Chap49</identifier>
<accessCondition type="use and reproduction" contentType="copyright">Springer-Verlag Berlin Heidelberg, 2008</accessCondition>
<recordInfo><recordContentSource>SPRINGER</recordContentSource>
<recordOrigin>Springer-Verlag Berlin Heidelberg, 2008</recordOrigin>
</recordInfo>
</mods>
</metadata>
<enrichments><istex:refBibTEI uri="https://api.istex.fr/document/A73D5FA654634F250F567B70C330A496FCB968A6/enrichments/refBib"><teiHeader></teiHeader>
<text><front></front>
<body></body>
<back><listBibl><biblStruct xml:id="b0"><analytic><title level="a" type="main">Automatic term detection: A review of current systems</title>
<author><persName><forename type="first">M</forename>
<forename type="middle">T</forename>
<surname>Cabré</surname>
</persName>
</author>
<author><persName><forename type="first">R</forename>
<forename type="middle">E</forename>
<surname>Bagot</surname>
</persName>
</author>
<author><persName><forename type="first">J</forename>
<forename type="middle">V</forename>
<surname>Palatresi</surname>
</persName>
</author>
</analytic>
<monogr><title level="m">Recent Advances in Computational Terminology. Natural Language Processing</title>
<editor>Bourigault, D., Jacquemin, C., L'Homme, M.C.</editor>
<meeting><address><addrLine>Amsterdam</addrLine>
</address>
</meeting>
<imprint><date type="published" when="2001"></date>
<biblScope unit="page" from="53" to="88"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b1"><analytic><title level="a" type="main">Automatic tagging of Arabic text: From raw text to base phrase chunks</title>
<author><persName><forename type="first">M</forename>
<surname>Diab</surname>
</persName>
</author>
<author><persName><forename type="first">K</forename>
<surname>Hacioglu</surname>
</persName>
</author>
<author><persName><forename type="first">D</forename>
<surname>Jurafsky</surname>
</persName>
</author>
</analytic>
<monogr><title level="m">Proceedings of HLT-NAACL</title>
<meeting>HLT-NAACL<address><addrLine>Boston</addrLine>
</address>
</meeting>
<imprint><date type="published" when="2004"></date>
<biblScope unit="page" from="149" to="152"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b2"><analytic><title level="a" type="main">Accurate methods for the statistics of surprise and coincidence</title>
<author><persName><forename type="first">T</forename>
<surname>Dunning</surname>
</persName>
</author>
</analytic>
<monogr><title level="j">Computational Linguistics</title>
<imprint><biblScope unit="volume">19</biblScope>
<biblScope unit="issue">1</biblScope>
<biblScope unit="page" from="61" to="74"></biblScope>
<date type="published" when="1993"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b3"><analytic><title level="a" type="main">Word association norms, mutual information, and lexicography</title>
<author><persName><forename type="first">K</forename>
<forename type="middle">W</forename>
<surname>Church</surname>
</persName>
</author>
<author><persName><forename type="first">P</forename>
<surname>Hanks</surname>
</persName>
</author>
</analytic>
<monogr><title level="m">Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics</title>
<meeting>the 27th Annual Meeting of the Association for Computational Linguistics<address><addrLine>Vancouver, B.C.</addrLine>
</address>
</meeting>
<imprint><date type="published" when="1989"></date>
<biblScope unit="page" from="76" to="83"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b4"><analytic><title level="a" type="main">Using statistics in lexical analysis</title>
<author><persName><forename type="first">K</forename>
<surname>Church</surname>
</persName>
</author>
<author><persName><forename type="first">W</forename>
<surname>Gale</surname>
</persName>
</author>
<author><persName><forename type="first">P</forename>
<surname>Hanks</surname>
</persName>
</author>
<author><persName><forename type="first">D</forename>
<surname>Hindle</surname>
</persName>
</author>
</analytic>
<monogr><title level="m">Lexical Acquisition: Exploiting On-Line Resources to Build a Lexicon</title>
<imprint><date type="published" when="1991"></date>
<biblScope unit="page" from="115" to="164"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b5"><analytic><title level="a" type="main">Methods of automatic term recognition: a review</title>
<author><persName><forename type="first">K</forename>
<surname>Kageura</surname>
</persName>
</author>
<author><persName><forename type="first">B</forename>
<surname>Umino</surname>
</persName>
</author>
</analytic>
<monogr><title level="j">Terminology</title>
<imprint><biblScope unit="volume">3</biblScope>
<biblScope unit="issue">2</biblScope>
<biblScope unit="page" from="259" to="289"></biblScope>
<date type="published" when="1996"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b6"><monogr><title level="m" type="main">Probabilistic Methods for Searching OCR-Degraded Arabic Text</title>
<author><persName><forename type="first">K</forename>
<surname>Darwish</surname>
</persName>
</author>
<imprint><date type="published" when="2003"></date>
<pubPlace>Maryland, USA</pubPlace>
</imprint>
</monogr>
</biblStruct>
</listBibl>
</back>
</text>
</istex:refBibTEI>
</enrichments>
</istex>
</record>
To work with this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002933 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 002933 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Istex |étape= Corpus |type= RBID |clé= ISTEX:A73D5FA654634F250F567B70C330A496FCB968A6 |texte= Impact of Term-Indexing for Arabic Document Retrieval }}
This area was generated with Dilib version V0.6.32.