A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon

Internal identifier : 000204 ( PascalFrancis/Corpus ); previous : 000203; next : 000205

Authors : I. Ben Cheikh ; A. Kacem ; A. Belaïd

Source :

Proceedings of SPIE, the International Society for Optical Engineering [ 0277-786X ] ; 2010.

RBID : Pascal:10-0429694

French descriptors

Pascal (Inist)
- Réseau neuronal, Reconnaissance forme, Recherche documentaire, Arabe, Lexique, Vocabulaire, Consonne, Précision, Apprentissage, 0130C, 0705M, 4230S.

English descriptors

KwdEn :
- Accuracy, Arabic, Consonants, Document retrieval, Learning, Lexicon, Neural networks, Pattern recognition, Vocabulary.

Abstract

Recently, we investigated the use of Arabic linguistic knowledge to improve the recognition of a wide Arabic word lexicon. A neural-linguistic approach was proposed to deal mainly with the canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. To this end, we designed two neural networks, TNN_R and TNN_S, to recognize roots and schemes, respectively, from the structural primitives of words. The proposed approach achieved promising results. In this paper, we focus on how to reach better results in terms of accuracy and recognition rate. The current improvements mainly concern the training stage: 1) exploiting the order of letters in a word, 2) considering "sister letters" (letters having the same features), 3) supervising the networks' behavior, 4) splitting neurons to preserve letter occurrences, and 5) resolving observed ambiguities. With these improvements, experiments carried out on a 1500-word vocabulary show a significant enhancement: the top-4 rate of TNN_R (resp. TNN_S) rose from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700 words, adding 100 words each time, confirmed the results without altering the networks' stability.
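The idea of factorizing a word into a root and a scheme can be illustrated with a small sketch. It is not the authors' TNN_R/TNN_S networks: it is a purely symbolic toy that assumes a Latin transliteration, a hypothetical scheme notation in which the digits 1-3 mark the root consonant slots (in that order), and a three-entry scheme list chosen only for illustration.

```python
import re

# Hypothetical transliterated schemes; digits 1-3 mark the root consonant slots.
SCHEMES = ["1a2a3a", "1aa2i3", "ma12uu3"]


def apply_scheme(root, scheme):
    """Fill slots 1-3 of a scheme with the three consonants of a root."""
    return "".join(root[int(ch) - 1] if ch.isdigit() else ch for ch in scheme)


def factorize(word, schemes=SCHEMES):
    """Return every (root, scheme) pair that regenerates the word.

    Assumes slots appear in the order 1, 2, 3 and that root consonants
    are any characters other than the short vowels a, i, u.
    """
    pairs = []
    for scheme in schemes:
        pattern = "".join(
            "([^aiu])" if ch.isdigit() else re.escape(ch) for ch in scheme
        )
        match = re.fullmatch(pattern, word)
        if match:
            pairs.append(("".join(match.groups()), scheme))
    return pairs


if __name__ == "__main__":
    print(apply_scheme("ktb", "1aa2i3"))
    print(factorize("maktuub"))
```

Under these assumptions, a lexicon of R roots and S schemes can be described with R + S symbols rather than R × S whole-word classes, which is one way to read the motivation for recognizing roots and schemes separately.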

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

A01	`01`	`1`		`@0 0277-786X`
A02	`01`			`@0 PSISDG`
A03		`1`		`@0 Proc. SPIE Int. Soc. Opt. Eng.`
A05				`@2 7534`
A08	`01`	`1`	`ENG`	`@1 A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon`
A09	`01`	`1`	`ENG`	`@1 Document recognition and retrieval XVII : 19-21 January 2010, San Jose, California, United States`
A11	`01`	`1`		`@1 BEN CHEIKH (I.)`
A11	`02`	`1`		`@1 KACEM (A.)`
A11	`03`	`1`		`@1 BELAÏD (A.)`
A12	`01`	`1`		`@1 LIKFORMAN-SULEM (Laurence) @9 ed.`
A12	`02`	`1`		`@1 AGAM (Gady) @9 ed.`
A14	`01`			`@1 UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara @2 1008 Tunis @3 TUN @Z 1 aut. @Z 2 aut.`
A14	`02`			`@1 LORIA, Campus scientifique, B.P. 239 @2 54606 Vandœuvre-Lès-Nancy @3 FRA @Z 3 aut.`
A18	`01`	`1`		`@1 SPIE @3 USA @9 org-cong.`
A18	`02`	`1`		`@1 IS&T @3 USA @9 org-cong.`
A18	`03`	`1`		`@1 Institut TELECOM @3 FRA @9 org-cong.`
A20				`@2 75340L.1-75340L.10`
A21				`@1 2010`
A23	`01`			`@0 ENG`
A25	`01`			`@1 SPIE @2 Bellingham WA`
A26	`01`			`@0 978-0-8194-7927-3`
A43	`01`			`@1 INIST @2 21760 @5 354000174683810200`
A44				`@0 0000 @1 © 2010 INIST-CNRS. All rights reserved.`
A45				`@0 10 ref.`
A47	`01`	`1`		`@0 10-0429694`
A60				`@1 P @2 C`
A61				`@0 A`
A64	`01`	`1`		`@0 Proceedings of SPIE, the International Society for Optical Engineering`
A66	`01`			`@0 USA`
C01	`01`		`ENG`	@0 Recently, we have investigated the use of Arabic linguistic knowledge to improve the recognition of wide Arabic word lexicon. A neural-linguistic approach was proposed to mainly deal with canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. In this direction, we conceived two neural networks TNN_R and TNN_S to respectively recognize roots and schemes from structural primitives of words. The proposal approach achieved promising results. In this paper, we will focus on how to reach better results in terms of accuracy and recognition rate. Current improvements concern especially the training stage. It is about 1) to benefit from word letters order 2) to consider "sisters letters" (letters having same features), 3) to supervise networks behaviors, 4) to split up neurons to save letter occurrences and 5) to solve observed ambiguities. Considering theses improvements, experiments carried on 1500 sized vocabulary show a significant enhancement: TNN_R (resp. TNN_S) top4 has gone up from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700, adding 100 words each time, again confirmed the results without altering the networks stability.
C02	`01`	`3`		`@0 001B00A30C`
C02	`02`	`3`		`@0 001B00G05M`
C02	`03`	`3`		`@0 001B40B30S`
C02	`04`	`X`		`@0 001D02C06`
C03	`01`	`3`	`FRE`	`@0 Réseau neuronal @5 23`
C03	`01`	`3`	`ENG`	`@0 Neural networks @5 23`
C03	`02`	`3`	`FRE`	`@0 Reconnaissance forme @5 61`
C03	`02`	`3`	`ENG`	`@0 Pattern recognition @5 61`
C03	`03`	`X`	`FRE`	`@0 Recherche documentaire @5 62`
C03	`03`	`X`	`ENG`	`@0 Document retrieval @5 62`
C03	`03`	`X`	`SPA`	`@0 Búsqueda documental @5 62`
C03	`04`	`X`	`FRE`	`@0 Arabe @5 63`
C03	`04`	`X`	`ENG`	`@0 Arabic @5 63`
C03	`04`	`X`	`SPA`	`@0 Árabe @5 63`
C03	`05`	`X`	`FRE`	`@0 Lexique @5 64`
C03	`05`	`X`	`ENG`	`@0 Lexicon @5 64`
C03	`05`	`X`	`SPA`	`@0 Léxico @5 64`
C03	`06`	`3`	`FRE`	`@0 Vocabulaire @5 65`
C03	`06`	`3`	`ENG`	`@0 Vocabulary @5 65`
C03	`07`	`3`	`FRE`	`@0 Consonne @5 66`
C03	`07`	`3`	`ENG`	`@0 Consonants @5 66`
C03	`08`	`3`	`FRE`	`@0 Précision @5 67`
C03	`08`	`3`	`ENG`	`@0 Accuracy @5 67`
C03	`09`	`3`	`FRE`	`@0 Apprentissage @5 68`
C03	`09`	`3`	`ENG`	`@0 Learning @5 68`
C03	`10`	`3`	`FRE`	`@0 0130C @4 INC @5 83`
C03	`11`	`3`	`FRE`	`@0 0705M @4 INC @5 91`
C03	`12`	`3`	`FRE`	`@0 4230S @4 INC @5 92`
N21				`@1 277`
N44	`01`			`@1 OTO`
N82				`@1 OTO`

A30	`01`	`1`	`ENG`	`@1 Document recognition and retrieval @2 17 @3 San Jose CA USA @4 2010`
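The notice lines above follow a regular layout: a field tag, up to three indicator cells, and a value made of `@`-coded subfields. A minimal parsing sketch follows, assuming that cells are tab-separated, that backticks only delimit cells, and that no subfield text contains an `@` character. The function name `parse_field` is hypothetical and not part of Dilib.

```python
import re


def parse_field(line):
    """Split one notice line into (tag, indicator cells, subfields).

    Indicator cells keep their position, so empty cells stay as ''.
    Subfields are returned as (code, text) pairs because a code such
    as Z may repeat within one field.
    """
    cells = [cell.strip("` ") for cell in line.split("\t")]
    tag, indicators, value = cells[0], cells[1:-1], cells[-1]
    subfields = [
        (code, text.strip())
        for code, text in re.findall(r"@(\w)\s([^@]*)", value)
    ]
    return tag, indicators, subfields


if __name__ == "__main__":
    line = "A14\t`02`\t\t\t`@1 LORIA, Campus scientifique, B.P. 239 @2 54606 Vandœuvre-Lès-Nancy @3 FRA @Z 3 aut.`"
    print(parse_field(line))
```

Returning pairs rather than a dictionary preserves repeated subfields, as in the first A14 field, where `@Z` occurs twice.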

Inist format (server)

NO :	PASCAL 10-0429694 INIST
ET :	A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon
AU :	BEN CHEIKH (I.); KACEM (A.); BELAÏD (A.); LIKFORMAN-SULEM (Laurence); AGAM (Gady)
AF :	UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara/1008 Tunis/Tunisie (1 aut., 2 aut.); LORIA, Campus scientifique, B.P. 239/54606 Vandœuvre-Lès-Nancy/France (3 aut.)
DT :	Publication en série; Congrès; Niveau analytique
SO :	Proceedings of SPIE, the International Society for Optical Engineering; ISSN 0277-786X; Coden PSISDG; Etats-Unis; Da. 2010; Vol. 7534; 75340L.1-75340L.10; Bibl. 10 ref.
LA :	Anglais
EA :	Recently, we have investigated the use of Arabic linguistic knowledge to improve the recognition of wide Arabic word lexicon. A neural-linguistic approach was proposed to mainly deal with canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. In this direction, we conceived two neural networks TNN_R and TNN_S to respectively recognize roots and schemes from structural primitives of words. The proposal approach achieved promising results. In this paper, we will focus on how to reach better results in terms of accuracy and recognition rate. Current improvements concern especially the training stage. It is about 1) to benefit from word letters order 2) to consider "sisters letters" (letters having same features), 3) to supervise networks behaviors, 4) to split up neurons to save letter occurrences and 5) to solve observed ambiguities. Considering theses improvements, experiments carried on 1500 sized vocabulary show a significant enhancement: TNN_R (resp. TNN_S) top4 has gone up from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700, adding 100 words each time, again confirmed the results without altering the networks stability.
CC :	001B00A30C; 001B00G05M; 001B40B30S; 001D02C06
FD :	Réseau neuronal; Reconnaissance forme; Recherche documentaire; Arabe; Lexique; Vocabulaire; Consonne; Précision; Apprentissage; 0130C; 0705M; 4230S
ED :	Neural networks; Pattern recognition; Document retrieval; Arabic; Lexicon; Vocabulary; Consonants; Accuracy; Learning
SD :	Búsqueda documental; Árabe; Léxico
LO :	INIST-21760.354000174683810200
ID :	10-0429694
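The server format above is a flat list of `TAG : value` lines, some of which hold `;`-separated lists. A small sketch of reading it into a dictionary follows, assuming one field per line and treating AU, CC, FD, ED and SD as list-valued. That choice of tags is an assumption made for illustration, and `parse_server_record` is a hypothetical name.

```python
# Assumption: these tags hold ';'-separated lists; other tags are kept as plain text.
LIST_TAGS = {"AU", "CC", "FD", "ED", "SD"}


def parse_server_record(text):
    """Map each tag (e.g. 'AU') to its value, splitting list-valued fields."""
    record = {}
    for line in text.splitlines():
        # Only the first colon separates tag and value; later colons belong to the value.
        tag, sep, value = line.partition(":")
        if not sep:
            continue
        tag, value = tag.strip(), value.strip()
        if tag in LIST_TAGS:
            record[tag] = [item.strip() for item in value.split(";")]
        else:
            record[tag] = value
    return record


if __name__ == "__main__":
    sample = (
        "NO :\tPASCAL 10-0429694 INIST\n"
        "AU :\tBEN CHEIKH (I.); KACEM (A.); BELAÏD (A.)\n"
        "LA :\tAnglais"
    )
    print(parse_server_record(sample))
```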

Links to Exploration step

Pascal:10-0429694

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon</title>
<author><name sortKey="Ben Cheikh, I" sort="Ben Cheikh, I" uniqKey="Ben Cheikh I" first="I." last="Ben Cheikh">I. Ben Cheikh</name>
<affiliation><inist:fA14 i1="01"><s1>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara</s1>
<s2>1008 Tunis</s2>
<s3>TUN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Kacem, A" sort="Kacem, A" uniqKey="Kacem A" first="A." last="Kacem">A. Kacem</name>
<affiliation><inist:fA14 i1="01"><s1>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara</s1>
<s2>1008 Tunis</s2>
<s3>TUN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Belaid, A" sort="Belaid, A" uniqKey="Belaid A" first="A." last="Belaïd">A. Belaïd</name>
<affiliation><inist:fA14 i1="02"><s1>LORIA, Campus scientifique, B.P. 239</s1>
<s2>54606 Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">10-0429694</idno>
<date when="2010">2010</date>
<idno type="stanalyst">PASCAL 10-0429694 INIST</idno>
<idno type="RBID">Pascal:10-0429694</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000204</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon</title>
<author><name sortKey="Ben Cheikh, I" sort="Ben Cheikh, I" uniqKey="Ben Cheikh I" first="I." last="Ben Cheikh">I. Ben Cheikh</name>
<affiliation><inist:fA14 i1="01"><s1>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara</s1>
<s2>1008 Tunis</s2>
<s3>TUN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Kacem, A" sort="Kacem, A" uniqKey="Kacem A" first="A." last="Kacem">A. Kacem</name>
<affiliation><inist:fA14 i1="01"><s1>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara</s1>
<s2>1008 Tunis</s2>
<s3>TUN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author><name sortKey="Belaid, A" sort="Belaid, A" uniqKey="Belaid A" first="A." last="Belaïd">A. Belaïd</name>
<affiliation><inist:fA14 i1="02"><s1>LORIA, Campus scientifique, B.P. 239</s1>
<s2>54606 Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<title level="j" type="abbreviated">Proc. SPIE Int. Soc. Opt. Eng.</title>
<idno type="ISSN">0277-786X</idno>
<imprint><date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<title level="j" type="abbreviated">Proc. SPIE Int. Soc. Opt. Eng.</title>
<idno type="ISSN">0277-786X</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Accuracy</term>
<term>Arabic</term>
<term>Consonants</term>
<term>Document retrieval</term>
<term>Learning</term>
<term>Lexicon</term>
<term>Neural networks</term>
<term>Pattern recognition</term>
<term>Vocabulary</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Réseau neuronal</term>
<term>Reconnaissance forme</term>
<term>Recherche documentaire</term>
<term>Arabe</term>
<term>Lexique</term>
<term>Vocabulaire</term>
<term>Consonne</term>
<term>Précision</term>
<term>Apprentissage</term>
<term>0130C</term>
<term>0705M</term>
<term>4230S</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Recently, we have investigated the use of Arabic linguistic knowledge to improve the recognition of wide Arabic word lexicon. A neural-linguistic approach was proposed to mainly deal with canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. In this direction, we conceived two neural networks TNN_R and TNN_S to respectively recognize roots and schemes from structural primitives of words. The proposal approach achieved promising results. In this paper, we will focus on how to reach better results in terms of accuracy and recognition rate. Current improvements concern especially the training stage. It is about 1) to benefit from word letters order 2) to consider "sisters letters" (letters having same features), 3) to supervise networks behaviors, 4) to split up neurons to save letter occurrences and 5) to solve observed ambiguities. Considering theses improvements, experiments carried on 1500 sized vocabulary show a significant enhancement: TNN_R (resp. TNN_S) top4 has gone up from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700, adding 100 words each time, again confirmed the results without altering the networks stability.</div>
</front>
</TEI>
<inist><standard h6="B"><pA><fA01 i1="01" i2="1"><s0>0277-786X</s0>
</fA01>
<fA02 i1="01"><s0>PSISDG</s0>
</fA02>
<fA03 i2="1"><s0>Proc. SPIE Int. Soc. Opt. Eng.</s0>
</fA03>
<fA05><s2>7534</s2>
</fA05>
<fA08 i1="01" i2="1" l="ENG"><s1>A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG"><s1>Document recognition and retrieval XVII : 19-21 January 2010, San Jose, California, United States</s1>
</fA09>
<fA11 i1="01" i2="1"><s1>BEN CHEIKH (I.)</s1>
</fA11>
<fA11 i1="02" i2="1"><s1>KACEM (A.)</s1>
</fA11>
<fA11 i1="03" i2="1"><s1>BELAÏD (A.)</s1>
</fA11>
<fA12 i1="01" i2="1"><s1>LIKFORMAN-SULEM (Laurence)</s1>
<s9>ed.</s9>
</fA12>
<fA12 i1="02" i2="1"><s1>AGAM (Gady)</s1>
<s9>ed.</s9>
</fA12>
<fA14 i1="01"><s1>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara</s1>
<s2>1008 Tunis</s2>
<s3>TUN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA14 i1="02"><s1>LORIA, Campus scientifique, B.P. 239</s1>
<s2>54606 Vandœuvre-Lès-Nancy</s2>
<s3>FRA</s3>
<sZ>3 aut.</sZ>
</fA14>
<fA18 i1="01" i2="1"><s1>SPIE</s1>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA18 i1="02" i2="1"><s1>IS&amp;T</s1>
<s3>USA</s3>
<s9>org-cong.</s9>
</fA18>
<fA18 i1="03" i2="1"><s1>Institut TELECOM</s1>
<s3>FRA</s3>
<s9>org-cong.</s9>
</fA18>
<fA20><s2>75340L.1-75340L.10</s2>
</fA20>
<fA21><s1>2010</s1>
</fA21>
<fA23 i1="01"><s0>ENG</s0>
</fA23>
<fA25 i1="01"><s1>SPIE</s1>
<s2>Bellingham WA</s2>
</fA25>
<fA26 i1="01"><s0>978-0-8194-7927-3</s0>
</fA26>
<fA43 i1="01"><s1>INIST</s1>
<s2>21760</s2>
<s5>354000174683810200</s5>
</fA43>
<fA44><s0>0000</s0>
<s1>© 2010 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45><s0>10 ref.</s0>
</fA45>
<fA47 i1="01" i2="1"><s0>10-0429694</s0>
</fA47>
<fA60><s1>P</s1>
<s2>C</s2>
</fA60>
<fA61><s0>A</s0>
</fA61>
<fA64 i1="01" i2="1"><s0>Proceedings of SPIE, the International Society for Optical Engineering</s0>
</fA64>
<fA66 i1="01"><s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG"><s0>Recently, we have investigated the use of Arabic linguistic knowledge to improve the recognition of wide Arabic word lexicon. A neural-linguistic approach was proposed to mainly deal with canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. In this direction, we conceived two neural networks TNN_R and TNN_S to respectively recognize roots and schemes from structural primitives of words. The proposal approach achieved promising results. In this paper, we will focus on how to reach better results in terms of accuracy and recognition rate. Current improvements concern especially the training stage. It is about 1) to benefit from word letters order 2) to consider "sisters letters" (letters having same features), 3) to supervise networks behaviors, 4) to split up neurons to save letter occurrences and 5) to solve observed ambiguities. Considering theses improvements, experiments carried on 1500 sized vocabulary show a significant enhancement: TNN_R (resp. TNN_S) top4 has gone up from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700, adding 100 words each time, again confirmed the results without altering the networks stability.</s0>
</fC01>
<fC02 i1="01" i2="3"><s0>001B00A30C</s0>
</fC02>
<fC02 i1="02" i2="3"><s0>001B00G05M</s0>
</fC02>
<fC02 i1="03" i2="3"><s0>001B40B30S</s0>
</fC02>
<fC02 i1="04" i2="X"><s0>001D02C06</s0>
</fC02>
<fC03 i1="01" i2="3" l="FRE"><s0>Réseau neuronal</s0>
<s5>23</s5>
</fC03>
<fC03 i1="01" i2="3" l="ENG"><s0>Neural networks</s0>
<s5>23</s5>
</fC03>
<fC03 i1="02" i2="3" l="FRE"><s0>Reconnaissance forme</s0>
<s5>61</s5>
</fC03>
<fC03 i1="02" i2="3" l="ENG"><s0>Pattern recognition</s0>
<s5>61</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE"><s0>Recherche documentaire</s0>
<s5>62</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG"><s0>Document retrieval</s0>
<s5>62</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA"><s0>Búsqueda documental</s0>
<s5>62</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE"><s0>Arabe</s0>
<s5>63</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG"><s0>Arabic</s0>
<s5>63</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA"><s0>Árabe</s0>
<s5>63</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE"><s0>Lexique</s0>
<s5>64</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG"><s0>Lexicon</s0>
<s5>64</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA"><s0>Léxico</s0>
<s5>64</s5>
</fC03>
<fC03 i1="06" i2="3" l="FRE"><s0>Vocabulaire</s0>
<s5>65</s5>
</fC03>
<fC03 i1="06" i2="3" l="ENG"><s0>Vocabulary</s0>
<s5>65</s5>
</fC03>
<fC03 i1="07" i2="3" l="FRE"><s0>Consonne</s0>
<s5>66</s5>
</fC03>
<fC03 i1="07" i2="3" l="ENG"><s0>Consonants</s0>
<s5>66</s5>
</fC03>
<fC03 i1="08" i2="3" l="FRE"><s0>Précision</s0>
<s5>67</s5>
</fC03>
<fC03 i1="08" i2="3" l="ENG"><s0>Accuracy</s0>
<s5>67</s5>
</fC03>
<fC03 i1="09" i2="3" l="FRE"><s0>Apprentissage</s0>
<s5>68</s5>
</fC03>
<fC03 i1="09" i2="3" l="ENG"><s0>Learning</s0>
<s5>68</s5>
</fC03>
<fC03 i1="10" i2="3" l="FRE"><s0>0130C</s0>
<s4>INC</s4>
<s5>83</s5>
</fC03>
<fC03 i1="11" i2="3" l="FRE"><s0>0705M</s0>
<s4>INC</s4>
<s5>91</s5>
</fC03>
<fC03 i1="12" i2="3" l="FRE"><s0>4230S</s0>
<s4>INC</s4>
<s5>92</s5>
</fC03>
<fN21><s1>277</s1>
</fN21>
<fN44 i1="01"><s1>OTO</s1>
</fN44>
<fN82><s1>OTO</s1>
</fN82>
</pA>
<pR><fA30 i1="01" i2="1" l="ENG"><s1>Document recognition and retrieval</s1>
<s2>17</s2>
<s3>San Jose CA USA</s3>
<s4>2010</s4>
</fA30>
</pR>
</standard>
<server><NO>PASCAL 10-0429694 INIST</NO>
<ET>A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon</ET>
<AU>BEN CHEIKH (I.); KACEM (A.); BELAÏD (A.); LIKFORMAN-SULEM (Laurence); AGAM (Gady)</AU>
<AF>UTIC-ESSTT, 5 Avenue Taha Hussein, BP56 Bab Menara/1008 Tunis/Tunisie (1 aut., 2 aut.); LORIA, Campus scientifique, B.P. 239/54606 Vandœuvre-Lès-Nancy/France (3 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>Proceedings of SPIE, the International Society for Optical Engineering; ISSN 0277-786X; Coden PSISDG; Etats-Unis; Da. 2010; Vol. 7534; 75340L.1-75340L.10; Bibl. 10 ref.</SO>
<LA>Anglais</LA>
<EA>Recently, we have investigated the use of Arabic linguistic knowledge to improve the recognition of wide Arabic word lexicon. A neural-linguistic approach was proposed to mainly deal with canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. In this direction, we conceived two neural networks TNN_R and TNN_S to respectively recognize roots and schemes from structural primitives of words. The proposal approach achieved promising results. In this paper, we will focus on how to reach better results in terms of accuracy and recognition rate. Current improvements concern especially the training stage. It is about 1) to benefit from word letters order 2) to consider "sisters letters" (letters having same features), 3) to supervise networks behaviors, 4) to split up neurons to save letter occurrences and 5) to solve observed ambiguities. Considering theses improvements, experiments carried on 1500 sized vocabulary show a significant enhancement: TNN_R (resp. TNN_S) top4 has gone up from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700, adding 100 words each time, again confirmed the results without altering the networks stability.</EA>
<CC>001B00A30C; 001B00G05M; 001B40B30S; 001D02C06</CC>
<FD>Réseau neuronal; Reconnaissance forme; Recherche documentaire; Arabe; Lexique; Vocabulaire; Consonne; Précision; Apprentissage; 0130C; 0705M; 4230S</FD>
<ED>Neural networks; Pattern recognition; Document retrieval; Arabic; Lexicon; Vocabulary; Consonants; Accuracy; Learning</ED>
<SD>Búsqueda documental; Árabe; Léxico</SD>
<LO>INIST-21760.354000174683810200</LO>
<ID>10-0429694</ID>
</server>
</inist>
</record>
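As an illustration of reading the XML record, the sketch below extracts the English keywords with Python's standard `xml.etree.ElementTree`. It works on a short fragment copied from the record rather than on the whole file: the full record uses an `inist:` prefix whose namespace declaration is not visible in this excerpt, so a namespace-aware parser may reject it as is. The function name `keyword_terms` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the record's <profileDesc>, shortened to three terms.
FRAGMENT = """<keywords scheme="KwdEn" xml:lang="en"><term>Accuracy</term>
<term>Arabic</term>
<term>Consonants</term>
</keywords>"""


def keyword_terms(xml_text):
    """Return the text of every <term> element in document order."""
    root = ET.fromstring(xml_text)
    return [term.text for term in root.iter("term")]


if __name__ == "__main__":
    print(keyword_terms(FRAGMENT))
```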

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/PascalFrancis/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000204 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000204 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:10-0429694
   |texte=   A Neural-Linguistic Approach for the Recognition of a Wide Arabic Word Lexicon
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022

	Exploration server for computer science research in Lorraine
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
