Exploration server for computer science research in Lorraine

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon

Internal identifier: 000204 (PascalFrancis/Corpus); previous: 000203; next: 000205

Authors: I. Ben Cheikh; A. Kacem; A. Belaïd

Source: Proceedings of SPIE, the International Society for Optical Engineering [0277-786X]; 2010, vol. 7534

RBID : Pascal:10-0429694

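French descriptors: Réseau neuronal; Reconnaissance forme; Recherche documentaire; Arabe; Lexique; Vocabulaire; Consonne; Précision; Apprentissage

English descriptors: Neural networks; Pattern recognition; Document retrieval; Arabic; Lexicon; Vocabulary; Consonants; Accuracy; Learning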

Abstract

Recently, we investigated the use of Arabic linguistic knowledge to improve the recognition of a wide Arabic word lexicon. A neural-linguistic approach was proposed, mainly to deal with a canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. To this end, we designed two neural networks, TNN_R and TNN_S, to recognize roots and schemes, respectively, from structural primitives of words. The proposed approach achieved promising results. In this paper, we focus on how to reach better results in terms of accuracy and recognition rate. The current improvements mainly concern the training stage. They consist in 1) benefiting from the order of word letters, 2) considering "sister letters" (letters having the same features), 3) supervising the networks' behavior, 4) splitting up neurons to save letter occurrences and 5) resolving observed ambiguities. With these improvements, experiments carried out on a 1500-word vocabulary show a significant enhancement: the TNN_R (resp. TNN_S) top-4 rate rose from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700 words, adding 100 words each time, again confirmed the results without altering the networks' stability.
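To make the factorization idea in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a toy Latin transliteration, a hypothetical scheme notation in which the digits 1 to 3 mark the slots of the three root consonants, and the usual reading of a top-k figure as the share of samples whose true label is among the k best-scored candidates. All names (`apply_scheme`, `top_k_rate`, the sample roots and schemes) are illustrative.

```python
from itertools import product


def apply_scheme(root, scheme):
    """Fill the digit slots 1-3 of a scheme with the root's consonants."""
    return "".join(root[int(ch) - 1] if ch in "123" else ch for ch in scheme)


# Toy data (assumption): two tri-consonant roots and three schemes.
roots = [("k", "t", "b"), ("d", "r", "s")]
schemes = ["1a2a3a", "1A2i3", "ma12U3"]

# Every word is indexed by its (root, scheme) pair, so two classifiers with
# len(roots) and len(schemes) classes cover len(roots) * len(schemes) words.
lexicon = {apply_scheme(r, s): (r, s) for r, s in product(roots, schemes)}


def top_k_rate(scored_samples, k):
    """Fraction of samples whose true label is among the k best-scored labels."""
    hits = 0
    for true_label, scores in scored_samples:
        best = sorted(scores, key=scores.get, reverse=True)[:k]
        hits += true_label in best
    return hits / len(scored_samples)


# Made-up scores for two samples, used only to exercise top_k_rate.
samples = [
    ("ktb", {"ktb": 0.5, "drs": 0.3}),
    ("drs", {"ktb": 0.6, "drs": 0.4}),
]
```

With this layout, adding one root extends the lexicon by one word per scheme without adding a word-level class, which is the appeal of recognizing roots and schemes separately.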

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 0277-786X
A02 01      @0 PSISDG
A03   1    @0 Proc. SPIE Int. Soc. Opt. Eng.
A05       @2 7534
A08 01  1  ENG  @1 A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon
A09 01  1  ENG  @1 Document recognition and retrieval XVII : 19-21 January 2010, San Jose, California, United States
A11 01  1    @1 BEN CHEIKH (I.)
A11 02  1    @1 KACEM (A.)
A11 03  1    @1 BELAÏD (A.)
A12 01  1    @1 LIKFORMAN-SULEM (Laurence) @9 ed.
A12 02  1    @1 AGAM (Gady) @9 ed.
A14 01      @1 UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara @2 1008 Tunis @3 TUN @Z 1 aut. @Z 2 aut.
A14 02      @1 LORIA, Campus scientifique, B.P. 239 @2 54606 Vandœuvre-Lès-Nancy @3 FRA @Z 3 aut.
A18 01  1    @1 SPIE @3 USA @9 org-cong.
A18 02  1    @1 IS&T @3 USA @9 org-cong.
A18 03  1    @1 Institut TELECOM @3 FRA @9 org-cong.
A20       @2 75340L.1-75340L.10
A21       @1 2010
A23 01      @0 ENG
A25 01      @1 SPIE @2 Bellingham WA
A26 01      @0 978-0-8194-7927-3
A43 01      @1 INIST @2 21760 @5 354000174683810200
A44       @0 0000 @1 © 2010 INIST-CNRS. All rights reserved.
A45       @0 10 ref.
A47 01  1    @0 10-0429694
A60       @1 P @2 C
A61       @0 A
A64 01  1    @0 Proceedings of SPIE, the International Society for Optical Engineering
A66 01      @0 USA
C01 01    ENG  @0 Recently, we investigated the use of Arabic linguistic knowledge to improve the recognition of a wide Arabic word lexicon. A neural-linguistic approach was proposed, mainly to deal with a canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. To this end, we designed two neural networks, TNN_R and TNN_S, to recognize roots and schemes, respectively, from structural primitives of words. The proposed approach achieved promising results. In this paper, we focus on how to reach better results in terms of accuracy and recognition rate. The current improvements mainly concern the training stage. They consist in 1) benefiting from the order of word letters, 2) considering "sister letters" (letters having the same features), 3) supervising the networks' behavior, 4) splitting up neurons to save letter occurrences and 5) resolving observed ambiguities. With these improvements, experiments carried out on a 1500-word vocabulary show a significant enhancement: the TNN_R (resp. TNN_S) top-4 rate rose from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700 words, adding 100 words each time, again confirmed the results without altering the networks' stability.
C02 01  3    @0 001B00A30C
C02 02  3    @0 001B00G05M
C02 03  3    @0 001B40B30S
C02 04  X    @0 001D02C06
C03 01  3  FRE  @0 Réseau neuronal @5 23
C03 01  3  ENG  @0 Neural networks @5 23
C03 02  3  FRE  @0 Reconnaissance forme @5 61
C03 02  3  ENG  @0 Pattern recognition @5 61
C03 03  X  FRE  @0 Recherche documentaire @5 62
C03 03  X  ENG  @0 Document retrieval @5 62
C03 03  X  SPA  @0 Búsqueda documental @5 62
C03 04  X  FRE  @0 Arabe @5 63
C03 04  X  ENG  @0 Arabic @5 63
C03 04  X  SPA  @0 Árabe @5 63
C03 05  X  FRE  @0 Lexique @5 64
C03 05  X  ENG  @0 Lexicon @5 64
C03 05  X  SPA  @0 Léxico @5 64
C03 06  3  FRE  @0 Vocabulaire @5 65
C03 06  3  ENG  @0 Vocabulary @5 65
C03 07  3  FRE  @0 Consonne @5 66
C03 07  3  ENG  @0 Consonants @5 66
C03 08  3  FRE  @0 Précision @5 67
C03 08  3  ENG  @0 Accuracy @5 67
C03 09  3  FRE  @0 Apprentissage @5 68
C03 09  3  ENG  @0 Learning @5 68
C03 10  3  FRE  @0 0130C @4 INC @5 83
C03 11  3  FRE  @0 0705M @4 INC @5 91
C03 12  3  FRE  @0 4230S @4 INC @5 92
N21       @1 277
N44 01      @1 OTO
N82       @1 OTO
pR  
A30 01  1  ENG  @1 Document recognition and retrieval @2 17 @3 San Jose CA USA @4 2010
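Each line of the listing above carries a field tag, optional indicators, and subfields introduced by `@` followed by a one-character code. The following Python sketch splits one such line; the function name `parse_field_line` and the returned dictionary layout are illustrative assumptions, not part of Dilib or of any Inist tooling. Because indicators are recovered by whitespace splitting, the sketch cannot tell which indicator position is filled when another one is blank.

```python
import re

# A subfield starts with "@", a one-character code, then whitespace.
SUBFIELD = re.compile(r"@(\w)\s+")


def parse_field_line(line):
    """Split a display line into tag, indicators and (code, value) subfields."""
    # re.split with one capture group yields [head, code1, value1, code2, ...].
    head, *rest = SUBFIELD.split(line)
    tokens = head.split()
    if not tokens:
        return None  # blank line
    subfields = [
        (code, value.strip()) for code, value in zip(rest[0::2], rest[1::2])
    ]
    return {"tag": tokens[0], "indicators": tokens[1:], "subfields": subfields}


affiliation = parse_field_line(
    "A14 02      @1 LORIA, Campus scientifique, B.P. 239 "
    "@2 54606 Vandœuvre-Lès-Nancy @3 FRA @Z 3 aut."
)
```

Subfields are kept as an ordered list rather than a dictionary because a code can repeat within one field, as `@Z` does in the first `A14` line.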

Inist format (server)

NO : PASCAL 10-0429694 INIST
ET : A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon
AU : BEN CHEIKH (I.); KACEM (A.); BELAÏD (A.); LIKFORMAN-SULEM (Laurence); AGAM (Gady)
AF : UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara/1008 Tunis/Tunisie (1 aut., 2 aut.); LORIA, Campus scientifique, B.P. 239/54606 Vandœuvre-Lès-Nancy/France (3 aut.)
DT : Publication en série; Congrès; Niveau analytique
SO : Proceedings of SPIE, the International Society for Optical Engineering; ISSN 0277-786X; Coden PSISDG; Etats-Unis; Da. 2010; Vol. 7534; 75340L.1-75340L.10; Bibl. 10 ref.
LA : Anglais
EA : Recently, we investigated the use of Arabic linguistic knowledge to improve the recognition of a wide Arabic word lexicon. A neural-linguistic approach was proposed, mainly to deal with a canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. To this end, we designed two neural networks, TNN_R and TNN_S, to recognize roots and schemes, respectively, from structural primitives of words. The proposed approach achieved promising results. In this paper, we focus on how to reach better results in terms of accuracy and recognition rate. The current improvements mainly concern the training stage. They consist in 1) benefiting from the order of word letters, 2) considering "sister letters" (letters having the same features), 3) supervising the networks' behavior, 4) splitting up neurons to save letter occurrences and 5) resolving observed ambiguities. With these improvements, experiments carried out on a 1500-word vocabulary show a significant enhancement: the TNN_R (resp. TNN_S) top-4 rate rose from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700 words, adding 100 words each time, again confirmed the results without altering the networks' stability.
CC : 001B00A30C; 001B00G05M; 001B40B30S; 001D02C06
FD : Réseau neuronal; Reconnaissance forme; Recherche documentaire; Arabe; Lexique; Vocabulaire; Consonne; Précision; Apprentissage; 0130C; 0705M; 4230S
ED : Neural networks; Pattern recognition; Document retrieval; Arabic; Lexicon; Vocabulary; Consonants; Accuracy; Learning
SD : Búsqueda documental; Árabe; Léxico
LO : INIST-21760.354000174683810200
ID : 10-0429694
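The server format above is a sequence of `KEY : value` lines in which several fields hold semicolon-separated lists. A minimal Python sketch of reading it into a dictionary follows; the name `parse_server_record` and the choice of which keys to treat as lists (`MULTI_VALUED`) are assumptions made for illustration. Note that the `AU` line mixes the three authors with the two names that the standard-format record marks as editors (`@9 ed.`), so the sketch returns all five.

```python
# Keys whose values are split on ";" (an assumption based on this record).
MULTI_VALUED = {"AU", "CC", "FD", "ED", "SD"}


def parse_server_record(text):
    """Turn 'KEY : value' lines into a dict; list-like fields become lists."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if not sep:
            continue  # not a field line
        key, value = key.strip(), value.strip()
        if key in MULTI_VALUED:
            record[key] = [item.strip() for item in value.split(";")]
        else:
            record[key] = value
    return record


sample = """NO : PASCAL 10-0429694 INIST
AU : BEN CHEIKH (I.); KACEM (A.); BELAÏD (A.); LIKFORMAN-SULEM (Laurence); AGAM (Gady)
LA : Anglais
CC : 001B00A30C; 001B00G05M; 001B40B30S; 001D02C06
ID : 10-0429694"""

record = parse_server_record(sample)
```

Free-text fields such as `ET` and `EA` are deliberately left unsplit, since a title or abstract may itself contain semicolons.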

Links to Exploration step

Pascal:10-0429694

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon</title>
<author>
<name sortKey="Ben Cheikh, I" sort="Ben Cheikh, I" uniqKey="Ben Cheikh I" first="I." last="Ben Cheikh">I. Ben Cheikh</name>
<affiliation>
<inist:fA14 i1="01">
<s1>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara</s1>
<s2>1008 Tunis</s2>
<s3>TUN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Kacem, A" sort="Kacem, A" uniqKey="Kacem A" first="A." last="Kacem">A. Kacem</name>
<affiliation>
<inist:fA14 i1="01">
<s1>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara</s1>
<s2>1008 Tunis</s2>
<s3>TUN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Belaid, A" sort="Belaid, A" uniqKey="Belaid A" first="A." last="Belaïd">A. Belaïd</name>
<affiliation>
<inist:fA14 i1="02">
<s1>LORIA, Campus scientifique, B.P. 239</s1>
<s2>54606 Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">10-0429694</idno>
<date when="2010">2010</date>
<idno type="stanalyst">PASCAL 10-0429694 INIST</idno>
<idno type="RBID">Pascal:10-0429694</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000204</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon</title>
<author>
<name sortKey="Ben Cheikh, I" sort="Ben Cheikh, I" uniqKey="Ben Cheikh I" first="I." last="Ben Cheikh">I. Ben Cheikh</name>
<affiliation>
<inist:fA14 i1="01">
<s1>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara</s1>
<s2>1008 Tunis</s2>
<s3>TUN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Kacem, A" sort="Kacem, A" uniqKey="Kacem A" first="A." last="Kacem">A. Kacem</name>
<affiliation>
<inist:fA14 i1="01">
<s1>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara</s1>
<s2>1008 Tunis</s2>
<s3>TUN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Belaid, A" sort="Belaid, A" uniqKey="Belaid A" first="A." last="Belaïd">A. Belaïd</name>
<affiliation>
<inist:fA14 i1="02">
<s1>LORIA, Campus scientifique, B.P. 239</s1>
<s2>54606 Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<title level="j" type="abbreviated">Proc. SPIE Int. Soc. Opt. Eng.</title>
<idno type="ISSN">0277-786X</idno>
<imprint>
<date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<title level="j" type="abbreviated">Proc. SPIE Int. Soc. Opt. Eng.</title>
<idno type="ISSN">0277-786X</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Accuracy</term>
<term>Arabic</term>
<term>Consonants</term>
<term>Document retrieval</term>
<term>Learning</term>
<term>Lexicon</term>
<term>Neural networks</term>
<term>Pattern recognition</term>
<term>Vocabulary</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Réseau neuronal</term>
<term>Reconnaissance forme</term>
<term>Recherche documentaire</term>
<term>Arabe</term>
<term>Lexique</term>
<term>Vocabulaire</term>
<term>Consonne</term>
<term>Précision</term>
<term>Apprentissage</term>
<term>0130C</term>
<term>0705M</term>
<term>4230S</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Recently, we investigated the use of Arabic linguistic knowledge to improve the recognition of a wide Arabic word lexicon. A neural-linguistic approach was proposed, mainly to deal with a canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. To this end, we designed two neural networks, TNN_R and TNN_S, to recognize roots and schemes, respectively, from structural primitives of words. The proposed approach achieved promising results. In this paper, we focus on how to reach better results in terms of accuracy and recognition rate. The current improvements mainly concern the training stage. They consist in 1) benefiting from the order of word letters, 2) considering "sister letters" (letters having the same features), 3) supervising the networks' behavior, 4) splitting up neurons to save letter occurrences and 5) resolving observed ambiguities. With these improvements, experiments carried out on a 1500-word vocabulary show a significant enhancement: the TNN_R (resp. TNN_S) top-4 rate rose from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700 words, adding 100 words each time, again confirmed the results without altering the networks' stability.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>0277-786X</s0>
</fA01>
<fA02 i1="01">
<s0>PSISDG</s0>
</fA02>
<fA03 i2="1">
<s0>Proc. SPIE Int. Soc. Opt. Eng.</s0>
</fA03>
<fA05>
<s2>7534</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG">
<s1>A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG">
<s1>Document recognition and retrieval XVII : 19-21 January 2010, San Jose, California, United States</s1>
</fA09>
<fA11 i1="01" i2="1">
<s1>BEN CHEIKH (I.)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>KACEM (A.)</s1>
</fA11>
<fA11 i1="03" i2="1">
<s1>BELAÏD (A.)</s1>
</fA11>
<fA12 i1="01" i2="1">
<s1>LIKFORMAN-SULEM (Laurence)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1">
<s1>AGAM (Gady)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01">
<s1>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara</s1>
<s2>1008 Tunis</s2>
<s3>TUN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA14 i1="02">
<s1>LORIA, Campus scientifique, B.P. 239</s1>
<s2>54606 Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>3 aut.</sZ>
</fA14>
<fA18 i1="01" i2="1">
<s1>SPIE</s1>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA18 i1="02" i2="1">
<s1>IS&amp;T</s1>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA18 i1="03" i2="1">
<s1>Institut TELECOM</s1>
<s3>FRA</s3>
<s9>org-cong.</s9>
</fA18>
<fA20>
<s2>75340L.1-75340L.10</s2>
</fA20>
<fA21>
<s1>2010</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA25 i1="01">
<s1>SPIE</s1>
<s2>Bellingham WA</s2>
</fA25>
<fA26 i1="01">
<s0>978-0-8194-7927-3</s0>
</fA26>
<fA43 i1="01">
<s1>INIST</s1>
<s2>21760</s2>
<s5>354000174683810200</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2010 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>10 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>10-0429694</s0>
</fA47>
<fA60>
<s1>P</s1>
<s2>C</s2>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>Proceedings of SPIE, the International Society for Optical Engineering</s0>
</fA64>
<fA66 i1="01">
<s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>Recently, we investigated the use of Arabic linguistic knowledge to improve the recognition of a wide Arabic word lexicon. A neural-linguistic approach was proposed, mainly to deal with a canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. To this end, we designed two neural networks, TNN_R and TNN_S, to recognize roots and schemes, respectively, from structural primitives of words. The proposed approach achieved promising results. In this paper, we focus on how to reach better results in terms of accuracy and recognition rate. The current improvements mainly concern the training stage. They consist in 1) benefiting from the order of word letters, 2) considering "sister letters" (letters having the same features), 3) supervising the networks' behavior, 4) splitting up neurons to save letter occurrences and 5) resolving observed ambiguities. With these improvements, experiments carried out on a 1500-word vocabulary show a significant enhancement: the TNN_R (resp. TNN_S) top-4 rate rose from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700 words, adding 100 words each time, again confirmed the results without altering the networks' stability.</s0>
</fC01>
<fC02 i1="01" i2="3">
<s0>001B00A30C</s0>
</fC02>
<fC02 i1="02" i2="3">
<s0>001B00G05M</s0>
</fC02>
<fC02 i1="03" i2="3">
<s0>001B40B30S</s0>
</fC02>
<fC02 i1="04" i2="X">
<s0>001D02C06</s0>
</fC02>
<fC03 i1="01" i2="3" l="FRE">
<s0>Réseau neuronal</s0>
<s5>23</s5>
</fC03>
<fC03 i1="01" i2="3" l="ENG">
<s0>Neural networks</s0>
<s5>23</s5>
</fC03>
<fC03 i1="02" i2="3" l="FRE">
<s0>Reconnaissance forme</s0>
<s5>61</s5>
</fC03>
<fC03 i1="02" i2="3" l="ENG">
<s0>Pattern recognition</s0>
<s5>61</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Recherche documentaire</s0>
<s5>62</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Document retrieval</s0>
<s5>62</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Búsqueda documental</s0>
<s5>62</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Arabe</s0>
<s5>63</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Arabic</s0>
<s5>63</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Árabe</s0>
<s5>63</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Lexique</s0>
<s5>64</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Lexicon</s0>
<s5>64</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Léxico</s0>
<s5>64</s5>
</fC03>
<fC03 i1="06" i2="3" l="FRE">
<s0>Vocabulaire</s0>
<s5>65</s5>
</fC03>
<fC03 i1="06" i2="3" l="ENG">
<s0>Vocabulary</s0>
<s5>65</s5>
</fC03>
<fC03 i1="07" i2="3" l="FRE">
<s0>Consonne</s0>
<s5>66</s5>
</fC03>
<fC03 i1="07" i2="3" l="ENG">
<s0>Consonants</s0>
<s5>66</s5>
</fC03>
<fC03 i1="08" i2="3" l="FRE">
<s0>Précision</s0>
<s5>67</s5>
</fC03>
<fC03 i1="08" i2="3" l="ENG">
<s0>Accuracy</s0>
<s5>67</s5>
</fC03>
<fC03 i1="09" i2="3" l="FRE">
<s0>Apprentissage</s0>
<s5>68</s5>
</fC03>
<fC03 i1="09" i2="3" l="ENG">
<s0>Learning</s0>
<s5>68</s5>
</fC03>
<fC03 i1="10" i2="3" l="FRE">
<s0>0130C</s0>
<s4>INC</s4>
<s5>83</s5>
</fC03>
<fC03 i1="11" i2="3" l="FRE">
<s0>0705M</s0>
<s4>INC</s4>
<s5>91</s5>
</fC03>
<fC03 i1="12" i2="3" l="FRE">
<s0>4230S</s0>
<s4>INC</s4>
<s5>92</s5>
</fC03>
<fN21>
<s1>277</s1>
</fN21>
<fN44 i1="01">
<s1>OTO</s1>
</fN44>
<fN82>
<s1>OTO</s1>
</fN82>
</pA>
<pR>
<fA30 i1="01" i2="1" l="ENG">
<s1>Document recognition and retrieval</s1>
<s2>17</s2>
<s3>San Jose CA USA</s3>
<s4>2010</s4>
</fA30>
</pR>
</standard>
<server>
<NO>PASCAL 10-0429694 INIST</NO>
<ET>A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon</ET>
<AU>BEN CHEIKH (I.); KACEM (A.); BELAÏD (A.); LIKFORMAN-SULEM (Laurence); AGAM (Gady)</AU>
<AF>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara/1008 Tunis/Tunisie (1 aut., 2 aut.); LORIA, Campus scientifique, B.P. 239/54606 Vandœuvre-Lès-Nancy/France (3 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>Proceedings of SPIE, the International Society for Optical Engineering; ISSN 0277-786X; Coden PSISDG; Etats-Unis; Da. 2010; Vol. 7534; 75340L.1-75340L.10; Bibl. 10 ref.</SO>
<LA>Anglais</LA>
<EA>Recently, we investigated the use of Arabic linguistic knowledge to improve the recognition of a wide Arabic word lexicon. A neural-linguistic approach was proposed, mainly to deal with a canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. To this end, we designed two neural networks, TNN_R and TNN_S, to recognize roots and schemes, respectively, from structural primitives of words. The proposed approach achieved promising results. In this paper, we focus on how to reach better results in terms of accuracy and recognition rate. The current improvements mainly concern the training stage. They consist in 1) benefiting from the order of word letters, 2) considering "sister letters" (letters having the same features), 3) supervising the networks' behavior, 4) splitting up neurons to save letter occurrences and 5) resolving observed ambiguities. With these improvements, experiments carried out on a 1500-word vocabulary show a significant enhancement: the TNN_R (resp. TNN_S) top-4 rate rose from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700 words, adding 100 words each time, again confirmed the results without altering the networks' stability.</EA>
<CC>001B00A30C; 001B00G05M; 001B40B30S; 001D02C06</CC>
<FD>Réseau neuronal; Reconnaissance forme; Recherche documentaire; Arabe; Lexique; Vocabulaire; Consonne; Précision; Apprentissage; 0130C; 0705M; 4230S</FD>
<ED>Neural networks; Pattern recognition; Document retrieval; Arabic; Lexicon; Vocabulary; Consonants; Accuracy; Learning</ED>
<SD>Búsqueda documental; Árabe; Léxico</SD>
<LO>INIST-21760.354000174683810200</LO>
<ID>10-0429694</ID>
</server>
</inist>
</record>
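The flat `<server>` element of the record lends itself to simple extraction. The sketch below uses Python's standard `xml.etree.ElementTree` on an abridged copy of that element only; it is an illustration, not a Dilib tool, and the name `server_fields` is hypothetical. The full record as displayed uses the `inist:` prefix without a namespace declaration, so a strict parser would likely reject it unless that prefix is declared or the fragment of interest is isolated first.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the <server> element shown above.
SERVER_XML = """<server>
<NO>PASCAL 10-0429694 INIST</NO>
<ET>A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon</ET>
<AU>BEN CHEIKH (I.); KACEM (A.); BELAÏD (A.); LIKFORMAN-SULEM (Laurence); AGAM (Gady)</AU>
<ED>Neural networks; Pattern recognition; Document retrieval; Arabic; Lexicon; Vocabulary; Consonants; Accuracy; Learning</ED>
<ID>10-0429694</ID>
</server>"""


def server_fields(xml_text):
    """Map each child tag of <server> to its stripped text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


fields = server_fields(SERVER_XML)
# English descriptors are a semicolon-separated list inside <ED>.
keywords = [term.strip() for term in fields["ED"].split(";")]
```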

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000204 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000204 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:10-0429694
   |texte=   A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022