Exploration server on haptic devices


The Construction of Semantic Memory: Grammar-Based Representations Learned from Relational Episodic Information

Internal identifier: 001D19 (Pmc/Curation); previous: 001D18; next: 001D20


Authors: Francesco P. Battaglia [Netherlands]; Cyriel M. A. Pennartz [Netherlands]

Source: Frontiers in Computational Neuroscience (2011)

RBID: PMC:3157741

Abstract

After acquisition, memories undergo a process of consolidation, making them more resistant to interference and brain injury. Memory consolidation involves systems-level interactions, most importantly between the hippocampus and associated structures, which take part in the initial encoding of memory, and the neocortex, which supports long-term storage. This dichotomy parallels the contrast between episodic memory (tied to the hippocampal formation), collecting an autobiographical stream of experiences, and semantic memory, a repertoire of facts and statistical regularities about the world, involving the neocortex at large. Experimental evidence points to a gradual transformation of memories, following encoding, from an episodic to a semantic character. This may require an exchange of information between different memory modules during inactive periods. We propose a theory for such interactions and for the formation of semantic memory, in which episodic memory is encoded as relational data. Semantic memory is modeled as a modified stochastic grammar, which learns to parse episodic configurations expressed as an association matrix. The grammar produces tree-like representations of episodes, describing the relationships between their main constituents at multiple levels of categorization, based on its current knowledge of world regularities. These regularities are learned by the grammar from episodic memory information, through an expectation-maximization procedure, analogous to the inside–outside algorithm for stochastic context-free grammars. We propose that a Monte-Carlo sampling version of this algorithm can be mapped onto the dynamics of “sleep replay” of previously acquired information in the hippocampus and neocortex. We further propose that the model can reproduce several properties of semantic memory, such as decontextualization, top-down processing, and creation of schemata.


URL: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3157741
DOI: 10.3389/fncom.2011.00036
PubMed: 21887143
PubMed Central: 3157741


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">The Construction of Semantic Memory: Grammar-Based Representations Learned from Relational Episodic Information</title>
<author>
<name sortKey="Battaglia, Francesco P" sort="Battaglia, Francesco P" uniqKey="Battaglia F" first="Francesco P." last="Battaglia">Francesco P. Battaglia</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Center for Neuroscience, Swammerdam Institute for Life Sciences, Universiteit van Amsterdam</institution>
<country>Amsterdam, Netherlands</country>
</nlm:aff>
<country xml:lang="fr">Pays-Bas</country>
<wicri:regionArea></wicri:regionArea>
<wicri:regionArea># see nlm:aff region in country</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Pennartz, Cyriel M A" sort="Pennartz, Cyriel M A" uniqKey="Pennartz C" first="Cyriel M. A." last="Pennartz">Cyriel M. A. Pennartz</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Center for Neuroscience, Swammerdam Institute for Life Sciences, Universiteit van Amsterdam</institution>
<country>Amsterdam, Netherlands</country>
</nlm:aff>
<country xml:lang="fr">Pays-Bas</country>
<wicri:regionArea></wicri:regionArea>
<wicri:regionArea># see nlm:aff region in country</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">21887143</idno>
<idno type="pmc">3157741</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3157741</idno>
<idno type="RBID">PMC:3157741</idno>
<idno type="doi">10.3389/fncom.2011.00036</idno>
<date when="2011">2011</date>
<idno type="wicri:Area/Pmc/Corpus">001D19</idno>
<idno type="wicri:Area/Pmc/Curation">001D19</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">The Construction of Semantic Memory: Grammar-Based Representations Learned from Relational Episodic Information</title>
<author>
<name sortKey="Battaglia, Francesco P" sort="Battaglia, Francesco P" uniqKey="Battaglia F" first="Francesco P." last="Battaglia">Francesco P. Battaglia</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Center for Neuroscience, Swammerdam Institute for Life Sciences, Universiteit van Amsterdam</institution>
<country>Amsterdam, Netherlands</country>
</nlm:aff>
<country xml:lang="fr">Pays-Bas</country>
<wicri:regionArea></wicri:regionArea>
<wicri:regionArea># see nlm:aff region in country</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Pennartz, Cyriel M A" sort="Pennartz, Cyriel M A" uniqKey="Pennartz C" first="Cyriel M. A." last="Pennartz">Cyriel M. A. Pennartz</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Center for Neuroscience, Swammerdam Institute for Life Sciences, Universiteit van Amsterdam</institution>
<country>Amsterdam, Netherlands</country>
</nlm:aff>
<country xml:lang="fr">Pays-Bas</country>
<wicri:regionArea></wicri:regionArea>
<wicri:regionArea># see nlm:aff region in country</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Frontiers in Computational Neuroscience</title>
<idno type="eISSN">1662-5188</idno>
<imprint>
<date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>After acquisition, memories undergo a process of consolidation, making them more resistant to interference and brain injury. Memory consolidation involves systems-level interactions, most importantly between the hippocampus and associated structures, which take part in the initial encoding of memory, and the neocortex, which supports long-term storage. This dichotomy parallels the contrast between episodic memory (tied to the hippocampal formation), collecting an autobiographical stream of experiences, and semantic memory, a repertoire of facts and statistical regularities about the world, involving the neocortex at large. Experimental evidence points to a gradual transformation of memories, following encoding, from an episodic to a semantic character. This may require an exchange of information between different memory modules during inactive periods. We propose a theory for such interactions and for the formation of semantic memory, in which episodic memory is encoded as relational data. Semantic memory is modeled as a modified stochastic grammar, which learns to parse episodic configurations expressed as an association matrix. The grammar produces tree-like representations of episodes, describing the relationships between their main constituents at multiple levels of categorization, based on its current knowledge of world regularities. These regularities are learned by the grammar from episodic memory information, through an expectation-maximization procedure, analogous to the inside–outside algorithm for stochastic context-free grammars. We propose that a Monte-Carlo sampling version of this algorithm can be mapped onto the dynamics of “sleep replay” of previously acquired information in the hippocampus and neocortex. We further propose that the model can reproduce several properties of semantic memory, such as decontextualization, top-down processing, and creation of schemata.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Ambros Ingerson, J" uniqKey="Ambros Ingerson J">J. Ambros-Ingerson</name>
</author>
<author>
<name sortKey="Granger, R" uniqKey="Granger R">R. Granger</name>
</author>
<author>
<name sortKey="Lynch, G" uniqKey="Lynch G">G. Lynch</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Baldi, P" uniqKey="Baldi P">P. Baldi</name>
</author>
<author>
<name sortKey="Chauvin, Y" uniqKey="Chauvin Y">Y. Chauvin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Battaglia, F P" uniqKey="Battaglia F">F. P. Battaglia</name>
</author>
<author>
<name sortKey="Sutherland, G R" uniqKey="Sutherland G">G. R. Sutherland</name>
</author>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bayley, P J" uniqKey="Bayley P">P. J. Bayley</name>
</author>
<author>
<name sortKey="Gold, J J" uniqKey="Gold J">J. J. Gold</name>
</author>
<author>
<name sortKey="Hopkins, R O" uniqKey="Hopkins R">R. O. Hopkins</name>
</author>
<author>
<name sortKey="Squire, L R" uniqKey="Squire L">L. R. Squire</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bi, G Q" uniqKey="Bi G">G. Q. Bi</name>
</author>
<author>
<name sortKey="Poo, M M" uniqKey="Poo M">M. M. Poo</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bod, R" uniqKey="Bod R">R. Bod</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bod, R" uniqKey="Bod R">R. Bod</name>
</author>
<author>
<name sortKey="Hay, J" uniqKey="Hay J">J. Hay</name>
</author>
<author>
<name sortKey="Jannedy, S" uniqKey="Jannedy S">S. Jannedy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Bod, R" uniqKey="Bod R">R. Bod</name>
</author>
<author>
<name sortKey="Scha, R" uniqKey="Scha R">R. Scha</name>
</author>
<author>
<name sortKey="Sima An, K" uniqKey="Sima An K">K. Sima'an</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Borensztajn, G" uniqKey="Borensztajn G">G. Borensztajn</name>
</author>
<author>
<name sortKey="Zuidema, W" uniqKey="Zuidema W">W. Zuidema</name>
</author>
<author>
<name sortKey="Bod, R" uniqKey="Bod R">R. Bod</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chater, N" uniqKey="Chater N">N. Chater</name>
</author>
<author>
<name sortKey="Tenenbaum, J B" uniqKey="Tenenbaum J">J. B. Tenenbaum</name>
</author>
<author>
<name sortKey="Yuille, A" uniqKey="Yuille A">A. Yuille</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cheng, S" uniqKey="Cheng S">S. Cheng</name>
</author>
<author>
<name sortKey="Frank, L M" uniqKey="Frank L">L. M. Frank</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Cohen, N J" uniqKey="Cohen N">N. J. Cohen</name>
</author>
<author>
<name sortKey="Eichenbaum, H" uniqKey="Eichenbaum H">H. Eichenbaum</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Courville, A C" uniqKey="Courville A">A. C. Courville</name>
</author>
<author>
<name sortKey="Daw, N D" uniqKey="Daw N">N. D. Daw</name>
</author>
<author>
<name sortKey="Touretzky, D S" uniqKey="Touretzky D">D. S. Touretzky</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Deregnaucourt, S" uniqKey="Deregnaucourt S">S. Derégnaucourt</name>
</author>
<author>
<name sortKey="Mitra, P P" uniqKey="Mitra P">P. P. Mitra</name>
</author>
<author>
<name sortKey="Feher, O" uniqKey="Feher O">O. Fehér</name>
</author>
<author>
<name sortKey="Pytte, C" uniqKey="Pytte C">C. Pytte</name>
</author>
<author>
<name sortKey="Tchernichovski, O" uniqKey="Tchernichovski O">O. Tchernichovski</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dienes, Z" uniqKey="Dienes Z">Z. Dienes</name>
</author>
<author>
<name sortKey="Altmann, G T" uniqKey="Altmann G">G. T. Altmann</name>
</author>
<author>
<name sortKey="Gao, S J J" uniqKey="Gao S">S.-J. J. Gao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ellenbogen, J M" uniqKey="Ellenbogen J">J. M. Ellenbogen</name>
</author>
<author>
<name sortKey="Hu, P T" uniqKey="Hu P">P. T. Hu</name>
</author>
<author>
<name sortKey="Payne, J D" uniqKey="Payne J">J. D. Payne</name>
</author>
<author>
<name sortKey="Titone, D" uniqKey="Titone D">D. Titone</name>
</author>
<author>
<name sortKey="Walker, M P" uniqKey="Walker M">M. P. Walker</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ernst, M O" uniqKey="Ernst M">M. O. Ernst</name>
</author>
<author>
<name sortKey="Banks, M S" uniqKey="Banks M">M. S. Banks</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Euston, D R" uniqKey="Euston D">D. R. Euston</name>
</author>
<author>
<name sortKey="Tatsuno, M" uniqKey="Tatsuno M">M. Tatsuno</name>
</author>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Felleman, D J" uniqKey="Felleman D">D. J. Felleman</name>
</author>
<author>
<name sortKey="Van Essen, D C" uniqKey="Van Essen D">D. C. Van Essen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Frankland, P W" uniqKey="Frankland P">P. W. Frankland</name>
</author>
<author>
<name sortKey="Bontempi, B" uniqKey="Bontempi B">B. Bontempi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Friston, K" uniqKey="Friston K">K. Friston</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="George, D" uniqKey="George D">D. George</name>
</author>
<author>
<name sortKey="Hawkins, J" uniqKey="Hawkins J">J. Hawkins</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="G Mez, R L" uniqKey="G Mez R">R. L. Gómez</name>
</author>
<author>
<name sortKey="Bootzin, R R" uniqKey="Bootzin R">R. R. Bootzin</name>
</author>
<author>
<name sortKey="Nadel, L" uniqKey="Nadel L">L. Nadel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Goodman, J" uniqKey="Goodman J">J. Goodman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Goodman, N D" uniqKey="Goodman N">N. D. Goodman</name>
</author>
<author>
<name sortKey="Tenenbaum, J B" uniqKey="Tenenbaum J">J. B. Tenenbaum</name>
</author>
<author>
<name sortKey="Feldman, J" uniqKey="Feldman J">J. Feldman</name>
</author>
<author>
<name sortKey="Griffiths, T L" uniqKey="Griffiths T">T. L. Griffiths</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hinton, G E" uniqKey="Hinton G">G. E. Hinton</name>
</author>
<author>
<name sortKey="Sejnowski, T J" uniqKey="Sejnowski T">T. J. Sejnowski</name>
</author>
<author>
<name sortKey="Ackley, D H" uniqKey="Ackley D">D. H. Ackley</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hoffman, K L" uniqKey="Hoffman K">K. L. Hoffman</name>
</author>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hopfield, J J" uniqKey="Hopfield J">J. J. Hopfield</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Isomura, Y" uniqKey="Isomura Y">Y. Isomura</name>
</author>
<author>
<name sortKey="Sirota, A" uniqKey="Sirota A">A. Sirota</name>
</author>
<author>
<name sortKey="Ozen, S" uniqKey="Ozen S">S. Ozen</name>
</author>
<author>
<name sortKey="Montgomery, S" uniqKey="Montgomery S">S. Montgomery</name>
</author>
<author>
<name sortKey="Mizuseki, K" uniqKey="Mizuseki K">K. Mizuseki</name>
</author>
<author>
<name sortKey="Henze, D A" uniqKey="Henze D">D. A. Henze</name>
</author>
<author>
<name sortKey="Buzsaki, G" uniqKey="Buzsaki G">G. Buzsáki</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ji, D" uniqKey="Ji D">D. Ji</name>
</author>
<author>
<name sortKey="Wilson, M A" uniqKey="Wilson M">M. A. Wilson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kahana, M J" uniqKey="Kahana M">M. J. Kahana</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kali, S" uniqKey="Kali S">S. Kali</name>
</author>
<author>
<name sortKey="Dayan, P" uniqKey="Dayan P">P. Dayan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kemp, C" uniqKey="Kemp C">C. Kemp</name>
</author>
<author>
<name sortKey="Tenenbaum, J B" uniqKey="Tenenbaum J">J. B. Tenenbaum</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kim, J J" uniqKey="Kim J">J. J. Kim</name>
</author>
<author>
<name sortKey="Fanselow, M S" uniqKey="Fanselow M">M. S. Fanselow</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ko Rding, K P" uniqKey="Ko Rding K">K. P. Koőrding</name>
</author>
<author>
<name sortKey="Wolpert, D M" uniqKey="Wolpert D">D. M. Wolpert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kudrimoti, H S" uniqKey="Kudrimoti H">H. S. Kudrimoti</name>
</author>
<author>
<name sortKey="Barnes, C A" uniqKey="Barnes C">C. A. Barnes</name>
</author>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lari, K" uniqKey="Lari K">K. Lari</name>
</author>
<author>
<name sortKey="Young, S J" uniqKey="Young S">S. J. Young</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lee, A K" uniqKey="Lee A">A. K. Lee</name>
</author>
<author>
<name sortKey="Wilson, M A" uniqKey="Wilson M">M. A. Wilson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lee, T S" uniqKey="Lee T">T. S. Lee</name>
</author>
<author>
<name sortKey="Mumford, D" uniqKey="Mumford D">D. Mumford</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ljungberg, T" uniqKey="Ljungberg T">T. Ljungberg</name>
</author>
<author>
<name sortKey="Apicella, P" uniqKey="Apicella P">P. Apicella</name>
</author>
<author>
<name sortKey="Schultz, W" uniqKey="Schultz W">W. Schultz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Manning, C" uniqKey="Manning C">C. Manning</name>
</author>
<author>
<name sortKey="Schutze, H" uniqKey="Schutze H">H. Schütze</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Markram, H" uniqKey="Markram H">H. Markram</name>
</author>
<author>
<name sortKey="Lubke, J" uniqKey="Lubke J">J. Lubke</name>
</author>
<author>
<name sortKey="Frotscher, M" uniqKey="Frotscher M">M. Frotscher</name>
</author>
<author>
<name sortKey="Sakmann, B" uniqKey="Sakmann B">B. Sakmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Marr, D" uniqKey="Marr D">D. Marr</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Maviel, T" uniqKey="Maviel T">T. Maviel</name>
</author>
<author>
<name sortKey="Durkin, T P" uniqKey="Durkin T">T. P. Durkin</name>
</author>
<author>
<name sortKey="Menzaghi, F" uniqKey="Menzaghi F">F. Menzaghi</name>
</author>
<author>
<name sortKey="Bontempi, B" uniqKey="Bontempi B">B. Bontempi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcclelland, J L" uniqKey="Mcclelland J">J. L. McClelland</name>
</author>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
<author>
<name sortKey="O Reilly, R C" uniqKey="O Reilly R">R. C. O'Reilly</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
<author>
<name sortKey="Barnes, C A" uniqKey="Barnes C">C. A. Barnes</name>
</author>
<author>
<name sortKey="Battaglia, F P" uniqKey="Battaglia F">F. P. Battaglia</name>
</author>
<author>
<name sortKey="Bower, M R" uniqKey="Bower M">M. R. Bower</name>
</author>
<author>
<name sortKey="Cowen, S L" uniqKey="Cowen S">S. L. Cowen</name>
</author>
<author>
<name sortKey="Ekstrom, A D" uniqKey="Ekstrom A">A. D. Ekstrom</name>
</author>
<author>
<name sortKey="Gerrard, J L" uniqKey="Gerrard J">J. L. Gerrard</name>
</author>
<author>
<name sortKey="Hoffman, K L" uniqKey="Hoffman K">K. L. Hoffman</name>
</author>
<author>
<name sortKey="Houston, F P" uniqKey="Houston F">F. P. Houston</name>
</author>
<author>
<name sortKey="Karten, Y" uniqKey="Karten Y">Y. Karten</name>
</author>
<author>
<name sortKey="Lipa, P" uniqKey="Lipa P">P. Lipa</name>
</author>
<author>
<name sortKey="Pennartz, C M" uniqKey="Pennartz C">C. M. Pennartz</name>
</author>
<author>
<name sortKey="Sutherland, G R" uniqKey="Sutherland G">G. R. Sutherland</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
<author>
<name sortKey="Morris, R G M" uniqKey="Morris R">R. G. M. Morris</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mesulam, M M" uniqKey="Mesulam M">M. M. Mesulam</name>
</author>
<author>
<name sortKey="Mufson, E J" uniqKey="Mufson E">E. J. Mufson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Moscovitch, M" uniqKey="Moscovitch M">M. Moscovitch</name>
</author>
<author>
<name sortKey="Nadel, L" uniqKey="Nadel L">L. Nadel</name>
</author>
<author>
<name sortKey="Winocur, G" uniqKey="Winocur G">G. Winocur</name>
</author>
<author>
<name sortKey="Gilboa, A" uniqKey="Gilboa A">A. Gilboa</name>
</author>
<author>
<name sortKey="Rosenbaum, R S" uniqKey="Rosenbaum R">R. S. Rosenbaum</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Moscovitch, M" uniqKey="Moscovitch M">M. Moscovitch</name>
</author>
<author>
<name sortKey="Rosenbaum, R S" uniqKey="Rosenbaum R">R. S. Rosenbaum</name>
</author>
<author>
<name sortKey="Gilboa, A" uniqKey="Gilboa A">A. Gilboa</name>
</author>
<author>
<name sortKey="Addis, D R" uniqKey="Addis D">D. R. Addis</name>
</author>
<author>
<name sortKey="Westmacott, R" uniqKey="Westmacott R">R. Westmacott</name>
</author>
<author>
<name sortKey="Grady, C" uniqKey="Grady C">C. Grady</name>
</author>
<author>
<name sortKey="Mcandrews, M P" uniqKey="Mcandrews M">M. P. McAndrews</name>
</author>
<author>
<name sortKey="Levine, B" uniqKey="Levine B">B. Levine</name>
</author>
<author>
<name sortKey="Black, S" uniqKey="Black S">S. Black</name>
</author>
<author>
<name sortKey="Winocur, G" uniqKey="Winocur G">G. Winocur</name>
</author>
<author>
<name sortKey="Nadel, L" uniqKey="Nadel L">L. Nadel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Neal, R M" uniqKey="Neal R">R. M. Neal</name>
</author>
<author>
<name sortKey="Hinton, G E" uniqKey="Hinton G">G. E. Hinton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="O Neill, J" uniqKey="O Neill J">J. O'Neill</name>
</author>
<author>
<name sortKey="Senior, T J" uniqKey="Senior T">T. J. Senior</name>
</author>
<author>
<name sortKey="Allen, K" uniqKey="Allen K">K. Allen</name>
</author>
<author>
<name sortKey="Huxter, J R" uniqKey="Huxter J">J. R. Huxter</name>
</author>
<author>
<name sortKey="Csicsvari, J" uniqKey="Csicsvari J">J. Csicsvari</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Orban, G" uniqKey="Orban G">G. Orbán</name>
</author>
<author>
<name sortKey="Fiser, J" uniqKey="Fiser J">J. Fiser</name>
</author>
<author>
<name sortKey="Aslin, R N" uniqKey="Aslin R">R. N. Aslin</name>
</author>
<author>
<name sortKey="Lengyel, M" uniqKey="Lengyel M">M. Lengyel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Peyrache, A" uniqKey="Peyrache A">A. Peyrache</name>
</author>
<author>
<name sortKey="Khamassi, M" uniqKey="Khamassi M">M. Khamassi</name>
</author>
<author>
<name sortKey="Benchenane, K" uniqKey="Benchenane K">K. Benchenane</name>
</author>
<author>
<name sortKey="Wiener, S I" uniqKey="Wiener S">S. I. Wiener</name>
</author>
<author>
<name sortKey="Battaglia, F P" uniqKey="Battaglia F">F. P. Battaglia</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Quillian, M R" uniqKey="Quillian M">M. R. Quillian</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rabiner, L R" uniqKey="Rabiner L">L. R. Rabiner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rao, R P" uniqKey="Rao R">R. P. Rao</name>
</author>
<author>
<name sortKey="Ballard, D H" uniqKey="Ballard D">D. H. Ballard</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rasch, B" uniqKey="Rasch B">B. Rasch</name>
</author>
<author>
<name sortKey="Born, J" uniqKey="Born J">J. Born</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Raven, J" uniqKey="Raven J">J. Raven</name>
</author>
<author>
<name sortKey="Raven, J C" uniqKey="Raven J">J. C. Raven</name>
</author>
<author>
<name sortKey="Court, J H" uniqKey="Court J">J. H. Court</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rogers, T T" uniqKey="Rogers T">T. T. Rogers</name>
</author>
<author>
<name sortKey="Mcclelland, J L" uniqKey="Mcclelland J">J. L. McClelland</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rohlf, F J" uniqKey="Rohlf F">F. J. Rohlf</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Savova, V" uniqKey="Savova V">V. Savova</name>
</author>
<author>
<name sortKey="Jakel, F" uniqKey="Jakel F">F. Jakel</name>
</author>
<author>
<name sortKey="Tenenbaum, J B" uniqKey="Tenenbaum J">J. B. Tenenbaum</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Savova, V" uniqKey="Savova V">V. Savova</name>
</author>
<author>
<name sortKey="Tenenbaum, J B" uniqKey="Tenenbaum J">J. B. Tenenbaum</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schabus, M" uniqKey="Schabus M">M. Schabus</name>
</author>
<author>
<name sortKey="Hodlmoser, K" uniqKey="Hodlmoser K">K. Hödlmoser</name>
</author>
<author>
<name sortKey="Gruber, G" uniqKey="Gruber G">G. Gruber</name>
</author>
<author>
<name sortKey="Sauter, C" uniqKey="Sauter C">C. Sauter</name>
</author>
<author>
<name sortKey="Anderer, P" uniqKey="Anderer P">P. Anderer</name>
</author>
<author>
<name sortKey="Klosch, G" uniqKey="Klosch G">G. Klösch</name>
</author>
<author>
<name sortKey="Parapatics, S" uniqKey="Parapatics S">S. Parapatics</name>
</author>
<author>
<name sortKey="Saletu, B" uniqKey="Saletu B">B. Saletu</name>
</author>
<author>
<name sortKey="Klimesch, W" uniqKey="Klimesch W">W. Klimesch</name>
</author>
<author>
<name sortKey="Zeitlhofer, J" uniqKey="Zeitlhofer J">J. Zeitlhofer</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Scoville, W B" uniqKey="Scoville W">W. B. Scoville</name>
</author>
<author>
<name sortKey="Milner, B" uniqKey="Milner B">B. Milner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Shastri, L" uniqKey="Shastri L">L. Shastri</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Shen, B" uniqKey="Shen B">B. Shen</name>
</author>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Siapas, A G" uniqKey="Siapas A">A. G. Siapas</name>
</author>
<author>
<name sortKey="Wilson, M A" uniqKey="Wilson M">M. A. Wilson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Simoncelli, E P" uniqKey="Simoncelli E">E. P. Simoncelli</name>
</author>
<author>
<name sortKey="Olshausen, B A" uniqKey="Olshausen B">B. A. Olshausen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sirota, A" uniqKey="Sirota A">A. Sirota</name>
</author>
<author>
<name sortKey="Csicsvari, J" uniqKey="Csicsvari J">J. Csicsvari</name>
</author>
<author>
<name sortKey="Buhl, D" uniqKey="Buhl D">D. Buhl</name>
</author>
<author>
<name sortKey="Buzsaki, G" uniqKey="Buzsaki G">G. Buzsáki</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Skaggs, W E" uniqKey="Skaggs W">W. E. Skaggs</name>
</author>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Squire, L R" uniqKey="Squire L">L. R. Squire</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stolcke, A" uniqKey="Stolcke A">A. Stolcke</name>
</author>
<author>
<name sortKey="Omohundro, S" uniqKey="Omohundro S">S. Omohundro</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sutherland, R J" uniqKey="Sutherland R">R. J. Sutherland</name>
</author>
<author>
<name sortKey="Rudy, J W" uniqKey="Rudy J">J. W. Rudy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Takashima, A" uniqKey="Takashima A">A. Takashima</name>
</author>
<author>
<name sortKey="Petersson, K M" uniqKey="Petersson K">K. M. Petersson</name>
</author>
<author>
<name sortKey="Rutters, F" uniqKey="Rutters F">F. Rutters</name>
</author>
<author>
<name sortKey="Tendolkar, I" uniqKey="Tendolkar I">I. Tendolkar</name>
</author>
<author>
<name sortKey="Jensen, O" uniqKey="Jensen O">O. Jensen</name>
</author>
<author>
<name sortKey="Zwarts, M J" uniqKey="Zwarts M">M. J. Zwarts</name>
</author>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
<author>
<name sortKey="Fernandez, G" uniqKey="Fernandez G">G. Fernández</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Teyler, T J" uniqKey="Teyler T">T. J. Teyler</name>
</author>
<author>
<name sortKey="Discenna, P" uniqKey="Discenna P">P. DiScenna</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tomasello, M" uniqKey="Tomasello M">M. Tomasello</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Treves, A" uniqKey="Treves A">A. Treves</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Treves, A" uniqKey="Treves A">A. Treves</name>
</author>
<author>
<name sortKey="Rolls, E T" uniqKey="Rolls E">E. T. Rolls</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tse, D" uniqKey="Tse D">D. Tse</name>
</author>
<author>
<name sortKey="Langston, R F" uniqKey="Langston R">R. F. Langston</name>
</author>
<author>
<name sortKey="Kakeyama, M" uniqKey="Kakeyama M">M. Kakeyama</name>
</author>
<author>
<name sortKey="Bethus, I" uniqKey="Bethus I">I. Bethus</name>
</author>
<author>
<name sortKey="Spooner, P A" uniqKey="Spooner P">P. A. Spooner</name>
</author>
<author>
<name sortKey="Wood, E R" uniqKey="Wood E">E. R. Wood</name>
</author>
<author>
<name sortKey="Witter, M P" uniqKey="Witter M">M. P. Witter</name>
</author>
<author>
<name sortKey="Morris, R G M" uniqKey="Morris R">R. G. M. Morris</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tulving, E" uniqKey="Tulving E">E. Tulving</name>
</author>
<author>
<name sortKey="Craik, F I M" uniqKey="Craik F">F. I. M. Craik</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ullman, S" uniqKey="Ullman S">S. Ullman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Valdes, J L" uniqKey="Valdes J">J. L. Valdes</name>
</author>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
<author>
<name sortKey="Fellous, J M" uniqKey="Fellous J">J. M. Fellous</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wagner, U" uniqKey="Wagner U">U. Wagner</name>
</author>
<author>
<name sortKey="Gais, S" uniqKey="Gais S">S. Gais</name>
</author>
<author>
<name sortKey="Haider, H" uniqKey="Haider H">H. Haider</name>
</author>
<author>
<name sortKey="Verleger, R" uniqKey="Verleger R">R. Verleger</name>
</author>
<author>
<name sortKey="Born, J" uniqKey="Born J">J. Born</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wei, G C G" uniqKey="Wei G">G. C. G. Wei</name>
</author>
<author>
<name sortKey="Tanner, M A" uniqKey="Tanner M">M. A. Tanner</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wilson, M A" uniqKey="Wilson M">M. A. Wilson</name>
</author>
<author>
<name sortKey="Mcnaughton, B L" uniqKey="Mcnaughton B">B. L. McNaughton</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Winocur, G" uniqKey="Winocur G">G. Winocur</name>
</author>
<author>
<name sortKey="Moscovitch, M" uniqKey="Moscovitch M">M. Moscovitch</name>
</author>
<author>
<name sortKey="Sekeres, M" uniqKey="Sekeres M">M. Sekeres</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yordanova, J" uniqKey="Yordanova J">J. Yordanova</name>
</author>
<author>
<name sortKey="Kolev, V" uniqKey="Kolev V">V. Kolev</name>
</author>
<author>
<name sortKey="Wagner, U" uniqKey="Wagner U">U. Wagner</name>
</author>
<author>
<name sortKey="Verleger, R" uniqKey="Verleger R">R. Verleger</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yu, A J" uniqKey="Yu A">A. J. Yu</name>
</author>
<author>
<name sortKey="Dayan, P" uniqKey="Dayan P">P. Dayan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zaborszky, L" uniqKey="Zaborszky L">L. Zaborszky</name>
</author>
<author>
<name sortKey="Gaykema, R P" uniqKey="Gaykema R">R. P. Gaykema</name>
</author>
<author>
<name sortKey="Swanson, D J" uniqKey="Swanson D">D. J. Swanson</name>
</author>
<author>
<name sortKey="Cullinan, W E" uniqKey="Cullinan W">W. E. Cullinan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhu, L" uniqKey="Zhu L">L. Zhu</name>
</author>
<author>
<name sortKey="Chen, Y" uniqKey="Chen Y">Y. Chen</name>
</author>
<author>
<name sortKey="Yuille, A" uniqKey="Yuille A">A. Yuille</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zola Morgan, S M" uniqKey="Zola Morgan S">S. M. Zola-Morgan</name>
</author>
<author>
<name sortKey="Squire, L R" uniqKey="Squire L">L. R. Squire</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Front Comput Neurosci</journal-id>
<journal-id journal-id-type="publisher-id">Front. Comput. Neurosci.</journal-id>
<journal-title-group>
<journal-title>Frontiers in Computational Neuroscience</journal-title>
</journal-title-group>
<issn pub-type="epub">1662-5188</issn>
<publisher>
<publisher-name>Frontiers Research Foundation</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">21887143</article-id>
<article-id pub-id-type="pmc">3157741</article-id>
<article-id pub-id-type="doi">10.3389/fncom.2011.00036</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>The Construction of Semantic Memory: Grammar-Based Representations Learned from Relational Episodic Information</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Battaglia</surname>
<given-names>Francesco P.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Pennartz</surname>
<given-names>Cyriel M. A.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Center for Neuroscience, Swammerdam Institute for Life Sciences, Universiteit van Amsterdam</institution>
<country>Amsterdam, Netherlands</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Stefano Fusi, Columbia University, USA</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Mate Lengyel, University of Cambridge, UK; Timothy T. Rogers, University of Wisconsin-Madison, USA</p>
</fn>
<corresp id="fn001">*Correspondence: Francesco P. Battaglia, Center for Neuroscience, Swammerdam Institute for Life Sciences, Universiteit van Amsterdam, Postbus 94246, 1090GE Amsterdam, Netherlands. e-mail:
<email>f.p.battaglia@uva.nl</email>
</corresp>
</author-notes>
<pub-date pub-type="epub">
<day>18</day>
<month>8</month>
<year>2011</year>
</pub-date>
<pub-date pub-type="collection">
<year>2011</year>
</pub-date>
<volume>5</volume>
<elocation-id>36</elocation-id>
<history>
<date date-type="received">
<day>23</day>
<month>2</month>
<year>2010</year>
</date>
<date date-type="accepted">
<day>29</day>
<month>7</month>
<year>2011</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2011 Battaglia and Pennartz.</copyright-statement>
<copyright-year>2011</copyright-year>
<license license-type="open-access" xlink:href="http://www.frontiersin.org/licenseagreement">
<license-p>This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.</license-p>
</license>
</permissions>
<abstract>
<p>After acquisition, memories undergo a process of consolidation, making them more resistant to interference and brain injury. Memory consolidation involves systems-level interactions, most importantly between the hippocampus and associated structures, which take part in the initial encoding of memory, and the neocortex, which supports long-term storage. This dichotomy parallels the contrast between episodic memory (tied to the hippocampal formation), collecting an autobiographical stream of experiences, and semantic memory, a repertoire of facts and statistical regularities about the world, involving the neocortex at large. Experimental evidence points to a gradual transformation of memories, following encoding, from an episodic to a semantic character. This may require an exchange of information between different memory modules during inactive periods. We propose a theory for such interactions and for the formation of semantic memory, in which episodic memory is encoded as relational data. Semantic memory is modeled as a modified stochastic grammar, which learns to parse episodic configurations expressed as an association matrix. The grammar produces tree-like representations of episodes, describing the relationships between their main constituents at multiple levels of categorization, based on its current knowledge of world regularities. These regularities are learned by the grammar from episodic memory information, through an expectation-maximization procedure, analogous to the inside–outside algorithm for stochastic context-free grammars. We propose that a Monte-Carlo sampling version of this algorithm can be mapped onto the dynamics of “sleep replay” of previously acquired information in the hippocampus and neocortex. We further propose that the model can reproduce several properties of semantic memory, such as decontextualization, top-down processing, and creation of schemata.</p>
</abstract>
<kwd-group>
<kwd>stochastic grammars</kwd>
<kwd>memory consolidation</kwd>
<kwd>sleep replay</kwd>
<kwd>episodic memory</kwd>
</kwd-group>
<counts>
<fig-count count="8"></fig-count>
<table-count count="0"></table-count>
<equation-count count="49"></equation-count>
<ref-count count="92"></ref-count>
<page-count count="22"></page-count>
<word-count count="17079"></word-count>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="introduction">
<label>1</label>
<title>Introduction</title>
<p>Semantic memory is a repertoire of “facts” about the world (Quillian,
<xref ref-type="bibr" rid="B55">1968</xref>
; Rogers and McClelland,
<xref ref-type="bibr" rid="B60">2004</xref>
), extracted from the analysis of statistical regularities and repeated occurrences in our experience. The brain stores information about the statistics of the environment at all scales of complexity: in the sensory system, this knowledge forms the basis for correctly interpreting our perceptions and making predictions about future occurrences (see, e.g., Simoncelli and Olshausen,
<xref ref-type="bibr" rid="B69">2001</xref>
). The same holds at higher cognitive levels, where relationships between objects and concepts, for example causation, similarity, and co-occurrence, must be learned and organized. Semantic memory is a highly structured system of information “learned inductively from the sparse and noisy data of an uncertain world” (Goodman et al.,
<xref ref-type="bibr" rid="B25">2008</xref>
). Recently, several structured probabilistic models have been proposed that are rich enough to represent semantic memory in its intricacies (Chater et al.,
<xref ref-type="bibr" rid="B10">2006</xref>
; Kemp and Tenenbaum,
<xref ref-type="bibr" rid="B33">2008</xref>
). In the field of computational linguistics (Manning and Schütze,
<xref ref-type="bibr" rid="B41">1999</xref>
; Bod,
<xref ref-type="bibr" rid="B6">2002</xref>
; Bod et al.,
<xref ref-type="bibr" rid="B7">2003</xref>
), many of these structured models have been devised to deal with language, which rivals semantic knowledge in complexity.</p>
<p>Within declarative memory, however, experience is first stored in a different subsystem: episodic memory, that is, an autobiographical stream (Tulving and Craik,
<xref ref-type="bibr" rid="B81">2000</xref>
) rich in contextual information. Some theorists (Sutherland and Rudy,
<xref ref-type="bibr" rid="B74">1989</xref>
; Cohen and Eichenbaum,
<xref ref-type="bibr" rid="B12">1993</xref>
; Shastri,
<xref ref-type="bibr" rid="B66">2002</xref>
) have proposed that episodic memory stores relational information, that is, the degrees of association between the different components of a single experience, rather than generalizations across experiences. Semantic memory, on the other hand, constitutes a knowledge repository spanning multiple episodes. Semantic memories are structured in such a way that they can be flexibly retrieved, combined, and integrated with new incoming data.</p>
<p>In the brain, semantic and episodic memory have at least partly distinct anatomical bases, respectively in the neocortex and in the medial temporal lobe (MTL; Scoville and Milner,
<xref ref-type="bibr" rid="B65">1957</xref>
; Squire,
<xref ref-type="bibr" rid="B72">1982</xref>
; Moscovitch et al.,
<xref ref-type="bibr" rid="B50">2005</xref>
). The MTL, and most prominently the hippocampus, is considered the critical store of newly formed declarative memories. These two subsystems interact intensively (Teyler and DiScenna,
<xref ref-type="bibr" rid="B76">1986</xref>
): at acquisition, cortical semantic representations may be referred to by “pointers” in the episodic configuration stored by the hippocampus (McNaughton et al.,
<xref ref-type="bibr" rid="B46">2002</xref>
). After acquisition, information about episodic memories is gradually transferred to the neocortex (Zola-Morgan and Squire,
<xref ref-type="bibr" rid="B92">1990</xref>
; Kim and Fanselow,
<xref ref-type="bibr" rid="B34">1992</xref>
; Maviel et al.,
<xref ref-type="bibr" rid="B44">2004</xref>
; Takashima et al.,
<xref ref-type="bibr" rid="B75">2006</xref>
; Tse et al.,
<xref ref-type="bibr" rid="B80">2007</xref>
), in the process named
<italic>systems consolidation</italic>
(Frankland and Bontempi,
<xref ref-type="bibr" rid="B20">2005</xref>
). This transfer of information may be supported by hippocampal/neocortical communication and the spontaneous, coherent reactivation of neural activity configurations (Wilson and McNaughton,
<xref ref-type="bibr" rid="B86">1994</xref>
; Siapas and Wilson,
<xref ref-type="bibr" rid="B68">1998</xref>
; Kudrimoti et al.,
<xref ref-type="bibr" rid="B36">1999</xref>
; Hoffman and McNaughton,
<xref ref-type="bibr" rid="B27">2002</xref>
; Sirota et al.,
<xref ref-type="bibr" rid="B70">2003</xref>
; Battaglia et al.,
<xref ref-type="bibr" rid="B3">2004</xref>
; Isomura et al.,
<xref ref-type="bibr" rid="B29">2006</xref>
; Ji and Wilson,
<xref ref-type="bibr" rid="B30">2007</xref>
; Rasch and Born,
<xref ref-type="bibr" rid="B58">2008</xref>
; Peyrache et al.,
<xref ref-type="bibr" rid="B54">2009</xref>
). Further, data from human and animal studies support the view that systems consolidation is not a mere relocation of memories, but includes a rearrangement of the content of memory according to the organizational principles of semantic memory: in consolidation, memories lose contextual information (Winocur et al.,
<xref ref-type="bibr" rid="B87">2007</xref>
), but they gain in flexibility. For example, memories consolidated during sleep enable “insight,” or the discovery of hidden statistical structure (Wagner et al.,
<xref ref-type="bibr" rid="B84">2004</xref>
; Ellenbogen et al.,
<xref ref-type="bibr" rid="B16">2007</xref>
). Such hidden correlations could not be inferred from the analysis of any single episode, and their discovery requires accumulation of evidence across multiple occurrences. Consolidated memories provide a schema, which facilitates the learning and storage of new information of the same kind, so that similar memories consolidate and transition to a hippocampus-independent state faster, as shown in rodents by Tse et al. (
<xref ref-type="bibr" rid="B80">2007</xref>
). In human infants, similar effects were observed in artificial grammar learning (Gómez et al.,
<xref ref-type="bibr" rid="B23">2006</xref>
).</p>
<p>So far, theories of memory consolidation and semantic memory formation in the brain have made use of connectionist approaches (McClelland et al.,
<xref ref-type="bibr" rid="B45">1995</xref>
) or unstructured unsupervised learning schemes (Kali and Dayan,
<xref ref-type="bibr" rid="B32">2004</xref>
). These models, however, can only represent semantic information in a very limited way, usually only for the particular task they were designed for. On the other hand, applications of structured probabilistic models to brain dynamics have hardly been attempted. We present here a novel theory of the interactions between episodic and semantic memory, inspired by computational linguistics (Manning and Schütze,
<xref ref-type="bibr" rid="B41">1999</xref>
; Bod,
<xref ref-type="bibr" rid="B6">2002</xref>
; Bod et al.,
<xref ref-type="bibr" rid="B7">2003</xref>
), where semantic memory is represented as a stochastic context-free grammar (SCFG), which is ideally suited to represent relationships between concepts in a hierarchy of complexity, as “parsing trees.” This SCFG is trained on episodic information, encoded in association matrices that capture such relationships. Once trained, the SCFG becomes a generative model, constructing episodes that are “likely” based on past experience. The generative model can be used for Bayesian inference on new episodes, and to make predictions about non-observed data. With analytical methods and numerical experiments, we show that the modified SCFG can learn to represent regularities present in more complex constructs than the uni-dimensional sequences typically studied in computational linguistics. These constructs, which we identify with episodes, are sets completely determined by the identity of the member items and by their pairwise associations. Pairwise associations determine the hierarchical grouping within the episode, as expressed by parsing trees. Further, we show that the learning algorithm can be expressed in a fully localist form, enabling mapping to biological neural systems. In a neural network interpretation, pairwise associations propagate in the network, to units representing higher-order nodes in parsing trees, and they are envisioned to be carried by correlations between the spike trains of different units. With simple simulations, we show that this model has several properties providing it with the potential to mimic aspects of semantic memory. Importantly, the complex Expectation-Maximization (EM) algorithm needed to train the grammar model can be expressed as a Monte-Carlo estimation, presenting suggestive analogies with hippocampal replay of neural patterns related to previous experience during sleep.</p>
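To make the grammar component concrete, the following minimal Python sketch (ours, not the authors' implementation; the rule set, symbol names, and probabilities are all invented) samples a parsing tree from a toy stochastic context-free grammar, with the probability of the tree given by the product of the probabilities of the rules applied.

import random

# Toy stochastic context-free grammar: each non-terminal maps to a list
# of (right-hand side, probability) pairs; any symbol absent from RULES
# is treated as a terminal.  All rules are invented for illustration.
RULES = {
    "S":   [(("NP", "VP"), 1.0)],
    "NP":  [(("Det", "N"), 0.6), (("N", "N"), 0.4)],
    "VP":  [(("V", "NP"), 1.0)],
    "Det": [(("the",), 1.0)],
    "N":   [(("dog",), 0.5), (("park",), 0.5)],
    "V":   [(("saw",), 0.7), (("chased",), 0.3)],
}

def sample(symbol="S"):
    """Expand `symbol` recursively; return (tree, probability of tree).

    The tree probability is the product of the probabilities of all
    rules used, matching the branching process sketched in Figure 1E."""
    if symbol not in RULES:                 # terminal: a leaf of the tree
        return symbol, 1.0
    rhs, p = random.choices(RULES[symbol],
                            weights=[w for _, w in RULES[symbol]])[0]
    prob, children = p, []
    for child in rhs:
        subtree, child_prob = sample(child)
        children.append(subtree)
        prob *= child_prob
    return (symbol, children), prob

tree, prob = sample()
print(tree)
print("P(tree) =", prob)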
</sec>
<sec sec-type="materials|methods" id="s1">
<label>2</label>
<title>Materials and Methods</title>
<sec>
<label>2.1</label>
<title>Relational codes for episodic memory</title>
<p>In this model, we concentrate on the interaction between an
<italic>episodic</italic>
memory module and a
<italic>semantic</italic>
memory module, roughly corresponding to the functions of the MTL and the neocortex, respectively. This interaction takes place at the time of memory acquisition and during consolidation; we focus here on this interactional aspect of (systems) memory consolidation, as defined in this and the following sections.</p>
<p>The episodic memory module contains representations of observations, or episodes. In this framework, each episode (indicated by
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
for the
<italic>n</italic>
-th episode) is seen as a set of objects (agents, actions, environmental cues, etc.). Thus, the vectors
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
for all episodes induce a joint probability distribution on the co-occurrence of multiple items, from which correlations at all orders (including those higher than second order) can be computed. Higher-order correlations have indeed been shown to affect the way humans and animals process complex stimulus configurations (Courville et al.,
<xref ref-type="bibr" rid="B13">2006</xref>
; Orbán et al.,
<xref ref-type="bibr" rid="B53">2008</xref>
). This rich correlational structure is augmented by a pairwise
<italic>episodic association</italic>
matrix
<italic>s
<sup>O</sup>
</italic>
(
<italic>i</italic>
,
<italic>j</italic>
), which describes proximity (e.g., spatial or temporal) of any two items
<italic>i</italic>
and
<italic>j</italic>
as they are perceived within a single episode.
<italic>s
<sup>O</sup>
</italic>
(
<italic>i</italic>
,
<italic>j</italic>
) is not restricted to being symmetric, and can therefore be used to describe directed links such as temporal ordering. Each episode defines its own episodic association matrix. In the example of Figure
<xref ref-type="fig" rid="F1">1</xref>
A (taken from Caravaggio's “The calling of Matthew”), several entities make up the biblical episode (white dots). The graph in Figure
<xref ref-type="fig" rid="F1">1</xref>
B is a representation of the relationships between some entities: shorter edges correspond to stronger links. The representation takes into account the spatial layout in the painting but also other factors, reflecting processing of the scene by multiple cortical modules. These processing modules are not explicitly modeled here, and we only assume that the outcome of their computations can be summarized in the episodic memory module as pairwise associations. Jesus’ (1) hand (5) is represented as closer to Jesus than to Peter (2), because the observer can easily determine whose hand it is. Also, Matthew's (6) hand gesture (7) is in response to Jesus’ pointing finger (5) so that a strong link is assigned to the two, with a temporal order (represented by the arrow), which is accounted for in the
<italic>s</italic>
matrix (dropping the superscript
<italic>O</italic>
when unambiguous; Figure
<xref ref-type="fig" rid="F1">1</xref>
C) by the fact that
<italic>s</italic>
(5,7) > 
<italic>s</italic>
(7,5). The
<italic>s</italic>
matrix is limited to pairwise associations, but it already contains a great deal of information about the overall structure of the episode. One way to extract this structure is to perform hierarchical clustering (see also Ambros-Ingerson et al.,
<xref ref-type="bibr" rid="B1">1990</xref>
), based on the association matrix: pairs of strongly associated items are clustered together first, and pairs of clusters are fused at each step. Thus,
<italic>clustering trees</italic>
are formed (Figure
<xref ref-type="fig" rid="F1">1</xref>
D). We defined a procedure that assigns a probability to each tree (see Section
<xref ref-type="sec" rid="s2">2.5</xref>
), so that trees joining strongly associated items first are given a high probability. Importantly, valuable information is contained in clustering trees beyond the most probable one. For example, the association between Jesus (1) and his hand (5) is only contained in the second most probable tree, whereas the association between Jesus’ hand (5) and Matthew and his gesture is only captured by the eighth most probable tree. Each tree corresponds to an alternative explanation of the scene, and each adds to its description, so that it is advantageous to retain multiple trees, corresponding to multiple descriptions of the same scene. This procedure is controlled by the parameter
<italic>β</italic>
(see Section
<xref ref-type="sec" rid="s2">2.5</xref>
), which operates as a “softmax” (or temperature, in analogy to Boltzmann distributions), and determines how much probability weight is assigned to the most probable trees. A large value of
<italic>β</italic>
corresponds to only considering the most likely clustering, a low value to giving all trees similar probabilities.</p>
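As an illustration of this clustering step, the sketch below is our own simplification: the four-item association matrix is invented, and the full tree-probability assignment of Section 2.5 (not reproduced in this excerpt) is reduced to a softmax choice of which pair of clusters to fuse at each step. It shows how β interpolates between keeping only the most likely clustering and weighting all trees nearly equally.

import numpy as np

rng = np.random.default_rng(0)

# Invented pairwise episodic association matrix s(i, j) for four items.
# It need not be symmetric (directed links are allowed), so it is
# symmetrized before clustering.
s = np.array([[0.0, 0.9, 0.1, 0.2],
              [0.8, 0.0, 0.2, 0.1],
              [0.1, 0.2, 0.0, 0.7],
              [0.2, 0.1, 0.6, 0.0]])

def sample_tree(s, beta=5.0):
    """Agglomerate items into a binary clustering tree.

    At each step, the pair of clusters to fuse is drawn from a softmax
    over association strengths: a large beta nearly always fuses the
    strongest pair (only the most likely clustering survives), while a
    small beta gives all merge orders, hence all trees, similar weight."""
    sym = (s + s.T) / 2.0
    # each cluster is (nested-tuple tree, set of leaf indices)
    clusters = [(i, {i}) for i in range(len(s))]
    while len(clusters) > 1:
        pairs, strength = [], []
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs.append((a, b))
                strength.append(np.mean([sym[i, j]
                                         for i in clusters[a][1]
                                         for j in clusters[b][1]]))
        w = np.exp(beta * np.asarray(strength))
        a, b = pairs[rng.choice(len(pairs), p=w / w.sum())]
        merged = ((clusters[a][0], clusters[b][0]),
                  clusters[a][1] | clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][0]

print(sample_tree(s))        # e.g. ((0, 1), (2, 3))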
<fig id="F1" position="float">
<label>Figure 1</label>
<caption>
<p>
<bold>Relational representation of episodes and stochastic context-free grammars for semantic memory</bold>
.
<bold>(A)</bold>
A biblical episode, as portrayed in the painting “The calling of Matthew” by Caravaggio (1571–1610): Jesus (to the right, number 1) enters a room and points to Matthew (6), who responds in surprise with a hand gesture (7). The white dots indicate some of the main items in the scene.
<bold>(B)</bold>
A graph representation of the interrelationships between the items [numbers in
<bold>(A)</bold>
] as they can be deduced from the scene: items with a stronger association are displayed closer to each other. The arrow indicates an association with a strong directional character (Matthew’s gesture – 7 – is a response to, and temporally follows, Jesus’ gesture – 5).
<bold>(C)</bold>
Color-coded matrix representation of the associations displayed in
<bold>(B)</bold>
. Note that the matrix elements are not necessarily symmetric, in particular, the (5,7) element is larger than the (7,5) element, because of the directional relationship described in
<bold>(B)</bold>
.
<bold>(D)</bold>
Hierarchical clustering derived from the episodic association matrix of
<bold>(C)</bold>
(see Section
<xref ref-type="sec" rid="s1">2</xref>
). The 10 most likely trees are displayed, each with its assigned probability.
<bold>(E)</bold>
Scheme of a branching process: the 3-D matrix
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
) denotes the probability that node
<italic>i</italic>
may generate nodes
<italic>j</italic>
and
<italic>k</italic>
. The probability of a complete tree is the product of the transition probabilities at all nodes.
<bold>(F)</bold>
Parsing tree of an English sentence: non-terminal nodes denote syntactic and grammatical components of the sentence (NP, noun phrase; PP, prepositional phrase; V, verb; N, noun; S, “sentence” or “start” node, at the root of the tree).</p>
</caption>
<graphic xlink:href="fncom-05-00036-g001"></graphic>
</fig>
<p>The activity of hippocampal neurons is well-suited to implement this relational code: during wakefulness, each entity (for example, each location) will elicit the activity of a cell assembly (a coherent group of cells), with the probability of co-activation of two cell assemblies being an increasing function of the association between the two encoded entities (McNaughton and Morris,
<xref ref-type="bibr" rid="B47">1987</xref>
). Thus, associational strength in the sense proposed here can be carried by coherent cell activity. For hippocampal place cells, for example, cells with overlapping place fields will have highly correlated activities. During sleep, the same activity correlations are reactivated (Wilson and McNaughton,
<xref ref-type="bibr" rid="B86">1994</xref>
).</p>
<p>It is tempting to speculate that episodic association matrices for several episodes can be stored by linear superposition in the synaptic matrix of an auto-associative attractor network, as assumed in the Hopfield model (Hopfield,
<xref ref-type="bibr" rid="B28">1982</xref>
). In this way, episodes could be retrieved by pattern completion upon presentation of incomplete cues, and spontaneously activated (or replayed) independently. This has been suggested as a useful model of episodic memory (McNaughton and Morris,
<xref ref-type="bibr" rid="B47">1987</xref>
; McClelland et al.,
<xref ref-type="bibr" rid="B45">1995</xref>
; Shen and McNaughton,
<xref ref-type="bibr" rid="B67">1996</xref>
) and a candidate description of the function of the hippocampus, particularly with respect to subfield CA3, and its rich recurrent connectivity (Treves and Rolls,
<xref ref-type="bibr" rid="B79">1994</xref>
). Here, however, we will not model the dynamics of episodic memory explicitly, and we will just assume that the episodic module is capable of storing and retrieving these relational data.</p>
<p>As we will see below, the hierarchical clustering operation may be performed by activity initiated by hippocampal reactivation, and propagated through several stages of cortical modules. It will be taken as the starting point for the training of semantic memory.</p>
</sec>
<sec>
<label>2.2</label>
<title>Stochastic grammars for semantic memory</title>
<p>Semantic memory extracts regularities manifesting themselves in multiple distinct episodes (Quillian,
<xref ref-type="bibr" rid="B55">1968</xref>
; Rogers and McClelland,
<xref ref-type="bibr" rid="B60">2004</xref>
). In our framework, semantic memory is seen as a generative model of the world, based on the accumulation of experience. In Bayesian statistics, a generative model is a prescription to produce a probability for each possible episode, based on the previously acquired
<italic>corpus</italic>
of knowledge. The model can then be inverted using Bayes’ rule, to produce interpretations of further data. The model will assign a large probability to a likely episode (regardless of whether that particular episode was observed before), and smaller probabilities to episodes that do not fit the model's current experience of the world. Once the model has been trained on the acquired experience, the values of its parameters can be seen as a statistical description of regularities in the world, potentially of a very complex nature. After training, Bayesian inference can be used to analyze further episodes, to assess its most likely “causes,” or underlying relationships. If only partial evidence is available, Bayesian inference will also support pattern completion.</p>
<p>Simple models for semantic memory and consolidation (McClelland et al.,
<xref ref-type="bibr" rid="B45">1995</xref>
; Kali and Dayan,
<xref ref-type="bibr" rid="B32">2004</xref>
), have defined semantic knowledge in terms of pairwise associations between items. In fact, pairwise association can already provide rich representations of episodes, which can be embedded in a semantic system. For example, in Figure
<xref ref-type="fig" rid="F1">1</xref>
A, associating Jesus (1) and his hand (5) depends on having a model of the human body, while coupling Jesus’ and Matthew's gestures requires Theory of Mind, and related models of gesture meaning. These complex cognitive operations, which require specific and extremely sophisticated models, well outside this work's scope, provide an input to the episodic memory module that we summarize here in a pairwise association matrix. Thus, we would like to formulate a generative model that assigns a probability to each possible association matrix, and is capable of capturing the highly structured and complex statistical regularities of the real world. We propose here a first step in this direction, borrowing from computational linguistics. This field has devised sophisticated generative models in the form of
<italic>stochastic grammars</italic>
(Manning and Schütze,
<xref ref-type="bibr" rid="B41">1999</xref>
), targeted at the analysis of language. For each sentence, stochastic grammars generate
<italic>parse trees</italic>
and assign to each a probability (Figure
<xref ref-type="fig" rid="F1">1</xref>
F). Parse trees are hierarchical groupings of sentence elements, where each group of words corresponds to a certain grammatical element. The resulting trees have
<italic>terminal</italic>
nodes, corresponding to the words in the sentence, and non-terminal nodes, which correspond to non-observed sentence constituents. A non-terminal node will encode, for example, the probability that a prepositional phrase (PP) is made up of a preposition (P: “of”) and a noun (N: “products”). These stochastic grammars can be trained (i.e., their parameters can be tuned) on a
<italic>corpus</italic>
of experienced utterances, in a supervised or unsupervised fashion.</p>
<p>The attractiveness of stochastic grammars is not limited to the linguistic realm (Bod,
<xref ref-type="bibr" rid="B6">2002</xref>
); to demonstrate how they can model semantic memory, and memory consolidation phenomena, we consider a particular class of grammars, termed stochastic context-free grammars (SCFGs). In SCFGs, sentences are generated by a
<italic>branching process</italic>
, a stochastic process in which at each stage, a state
<italic>i</italic>
generates two further states,
<italic>j</italic>
and
<italic>k</italic>
with probability given by the transition matrix
<italic>p</italic>
(
<italic>i</italic>
 → 
<italic>j</italic>
,
<italic>k</italic>
) = 
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
) (Figure
<xref ref-type="fig" rid="F1">1</xref>
E). The process always starts from the start node
<italic>S</italic>
, and, in several stages, it produces a binary tree, with the words in the sentence generated as the terminal leaves of the tree, and the syntactical components of the sentence as non-terminal nodes. To this tree, a probability is assigned which is the product of the transition probabilities at each non-terminal node. Thus,
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
) represents all of the knowledge embedded in the grammar, that is all grammatical rules. Such knowledge can be extracted from a body of experienced sentences, through EM algorithms such as the
<italic>inside–outside</italic>
algorithm (Lari and Young,
<xref ref-type="bibr" rid="B37">1990</xref>
). This algorithm exploits a property of branching processes similar to the key property of Markov chains: if a tree is split at any given non-terminal node, the probabilities of the two sub-trees are independent, conditional on the identity of the node where the tree was split (Figure
<xref ref-type="fig" rid="F2">2</xref>
A). Because of this, probabilities can be computed from two independent terms, the first, the
<italic>inside</italic>
probability
<italic>e</italic>
(
<italic>i</italic>
,
<bold>K</bold>
), representing the probability that a certain non-terminal node
<italic>i</italic>
in the tree generates exactly the substring
<bold>K</bold>
(Figure
<xref ref-type="fig" rid="F2">2</xref>
B). The other term, the
<italic>outside</italic>
probability
<italic>f</italic>
(
<italic>i</italic>
,
<bold>K</bold>
), represents the probability that the non-terminal node
<italic>i</italic>
is generated in the process together with the complete sentence
<bold>S</bold>
minus the substring
<bold>K</bold>
. In the Expectation step (E-step) of the algorithm, the inside and outside probabilities are computed recursively (see Section
<xref ref-type="sec" rid="s4">2.8.1</xref>
) based on the current values of the transition matrix
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
) as computed from previous experience. The recursive algorithm highlights the respective contributions of bottom-up and top-down influences in determining these probabilities. In the maximization step (M-step), the
<italic>a</italic>
matrix is updated based on the value of the inside and outside probabilities.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption>
<p>
<bold>Inside–outside algorithm</bold>
.
<bold>(A)</bold>
Markov-like property for trees in branching processes: if a tree
<italic>T</italic>
<sub>1</sub>
is split in two sub-trees
<italic>T</italic>
<sub>2</sub>
and
<italic>T</italic>
<sub>3</sub>
, the probability of the original tree is the product of the probabilities of the two sub-trees. Also, the two sub-trees are independent conditional to the value of the node at the separation point.
<bold>(B)</bold>
The probabilities in the two sub-trees can be computed separately: the inside probability can be computed recursively in a bottom-up fashion, the outside probabilities can be computed recursively top-down.</p>
</caption>
<graphic xlink:href="fncom-05-00036-g002"></graphic>
</fig>
<p>Thus, while trees are not the most general graph structure found in semantic data (Kemp and Tenenbaum,
<xref ref-type="bibr" rid="B33">2008</xref>
), they provide an especially simple and efficient way to implement learning (other graphical models, especially those containing loops, do not enjoy the same Markov-like properties, making EM approaches much more difficult), and thus are a suitable starting point for an investigation of memory processes for structured information. However, in order to make use of the set of tools from computational linguistics, we need to make a key modification: Stochastic grammars are generative models for sequences of symbols, or utterances. This has to be extended to more general structures, and here we propose how to define a SCFG that generates episodes in terms of association matrices. We want to use the relational data contained in the
<italic>s</italic>
matrix coding observed episodes to optimize the
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
) transition matrix. We wish to obtain a grammar that, on average, assigns large probabilities to the trees where pairs of items expected, based on experience, to have large associational strengths are closely clustered. This should hold for the data the grammar is trained on, but must also allow generalization to further data. For these reasons, we change the transition rule in the branching process as follows:</p>
<disp-formula id="E1">
<label>(1)</label>
<mml:math id="M1">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>M</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>that is, the probability of node
<italic>i</italic>
generating nodes
<italic>j</italic>
and
<italic>k</italic>
is given by the
<italic>a</italic>
matrix (which we will call henceforth the
<italic>semantic</italic>
transition matrix, and reflects accumulated knowledge), multiplied by the set-wise association
<italic>M</italic>
(
<bold>P</bold>
,
<bold>Q</bold>
), a function of the current episode only, measuring the associational strengths between the subset
<bold>P</bold>
and
<bold>Q</bold>
, which in the tree are generated, respectively, by nodes
<italic>j</italic>
and
<italic>k</italic>
. Such an arrangement amplifies the contributions from pairs of sets that correspond to likely entities, which may be joined together. The term
<italic>M</italic>
(
<bold>P</bold>
,
<bold>Q</bold>
) is obtained from the episodic association matrix
<italic>s</italic>
(
<italic>i</italic>
,
<italic>j</italic>
) by means of a hierarchical clustering algorithm, and denotes the likelihood that a naive observer will single out the subsets
<bold>P</bold>
and
<bold>Q</bold>
when observing all the items in
<bold>P</bold>
 ∪ 
<bold>Q</bold>
and their interrelationships (see Section
<xref ref-type="sec" rid="s2">2.5</xref>
). Eq.
<xref ref-type="disp-formula" rid="E1">1</xref>
is the key component of a generative model, defining the probabilities of episodes, both in terms of their composition (the
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
) and the association matrix
<italic>s</italic>
, through the
<italic>M</italic>
(
<bold>P</bold>
,
<bold>Q</bold>
) function, as explained in Section
<xref ref-type="sec" rid="s2">2.5</xref>
.</p>
<p>With respect to the standard formulation of an SCFG (see, e.g., Lari and Young,
<xref ref-type="bibr" rid="B37">1990</xref>
; Manning and Schütze,
<xref ref-type="bibr" rid="B41">1999</xref>
) we made the further modification of eliminating the distinction between non-terminals and terminals, with unary transition probabilities between these two sorts of items. This further prescription is needed in linguistics, for example in order to categorize all nouns under a common type “noun,” but does not have an impact in the cases we will consider here. This abstract formulation has a possible parallel in cortical anatomy and physiology: for example, nodes at different levels in the parsing tree may correspond to modules at different levels in a cortical hierarchy (Felleman and Van Essen,
<xref ref-type="bibr" rid="B19">1991</xref>
), which could be implemented in more or less distributed modules, as described below. Training such a model may require a long time: The E-step entails computing a sum over a combinatorially large number of possible parse trees, as explained in Section
<xref ref-type="sec" rid="s3">2.8.</xref>
A crucial assumption here is that, in the brain, this calculation is performed by Monte-Carlo sampling: this may take place during the extended memory consolidation intervals following acquisition. Eq.
<xref ref-type="disp-formula" rid="E31">25</xref>
(see Section
<xref ref-type="sec" rid="s6">2.9</xref>
) defines an update rule allowing gradual optimization of the
<italic>a</italic>
matrix through successive presentations of the episodes. In the brain this could be implemented during sleep replay as follows: during each reactivation event (corresponding, e.g., to a hippocampal sharp wave, Kudrimoti et al.,
<xref ref-type="bibr" rid="B36">1999</xref>
), a subset of the encoded episode is reactivated in the hippocampus. The probability of the representations of two entities both being active in a reactivation event is a function of their episodic associational strength (Wilson and McNaughton,
<xref ref-type="bibr" rid="B86">1994</xref>
). The hippocampal input activates representations at multiple levels in the cortical hierarchy, corresponding to different levels in the parsing trees. At each level, the information relative to the episodic association matrix
<italic>s</italic>
(as well as
<italic>M</italic>
) can be computed from the probability that ascending inputs activate the corresponding units, so that perceptual data “percolate” in the cortical hierarchy.</p>
</sec>
<sec>
<label>2.3</label>
<title>Semantic networks, Monte-Carlo sampling, and cortical circuitry</title>
<p>Optimizing the
<italic>a</italic>
matrix is a very complex task, requiring the evaluation of several global quantities. However, it is possible to implement this optimization in an algorithm based on single module-based quantities, and with a dynamics inspired by the physiology of the sleeping neocortex. The learning rule in the consolidation algorithm acts on the node transition probabilities (see Section
<xref ref-type="sec" rid="s6">2.9</xref>
):</p>
<disp-formula id="E2">
<label>(2)</label>
<mml:math id="M2">
<mml:mrow>
<mml:mo>Δ</mml:mo>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo></mml:mo>
<mml:mfrac>
<mml:mo>η</mml:mo>
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo></mml:mo>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msup>
<mml:mi>a</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where
<italic>η</italic>
is the learning rate.
<italic>E</italic>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
), Γ
<italic>
<sub>i</sub>
</italic>
and Γ
<italic>
<sub>ijk</sub>
</italic>
are probability terms entering the Bayes formula (see Section
<xref ref-type="sec" rid="s5">2.8.2</xref>
):
<italic>E</italic>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
) can be interpreted as the degree of familiarity of the episode
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
given the current state of the model
<italic>G</italic>
(see Section
<xref ref-type="sec" rid="s8">2.12</xref>
). Moreover,</p>
<disp-formula id="E3">
<mml:math id="M3">
<mml:mrow>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext>node </mml:mtext>
<mml:mi>i</mml:mi>
<mml:mtext> is used</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext></mml:mtext>
<mml:mi>s</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>|</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>and</p>
<disp-formula id="E4">
<mml:math id="M4">
<mml:mrow>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext>node </mml:mtext>
<mml:mi>i</mml:mi>
<mml:mtext> is used</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext></mml:mtext>
<mml:mi>s</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>|</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>(see Section
<xref ref-type="sec" rid="s5">2.8.2</xref>
for a complete derivation). Thus Γ
<italic>
<sub>i</sub>
</italic>
is the probability that node
<italic>i</italic>
enters the parsing tree somewhere, and that the tree's terminal nodes coincide with the items in the episode. Γ
<italic>
<sub>ijk</sub>
</italic>
is the probability that node
<italic>i</italic>
enters the tree and spawns nodes
<italic>j</italic>
and
<italic>k</italic>
, while generating the entire episode
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
as the terminals in the tree. These terms can be computed recursively through the inside–outside probabilities, which are very convenient for computer calculations. However, these terms can also be directly computed as a sum of probabilities over trees</p>
<disp-formula id="E5">
<label>(3)</label>
<mml:math id="M5">
<mml:mrow>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:mtext>trees including node </mml:mtext>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>and</p>
<disp-formula id="E6">
<label>(4)</label>
<mml:math id="M6">
<mml:mrow>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:mtext>trees including node </mml:mtext>
<mml:mi>i</mml:mi>
<mml:mtext> and </mml:mtext>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where for each tree
<italic>t</italic>
the probability
<italic>p</italic>
(
<italic>t</italic>
) may be computed as a product of the transition probabilities from Eq.
<xref ref-type="disp-formula" rid="E1">1</xref>
at all nodes. Using Eqs
<xref ref-type="disp-formula" rid="E5">3</xref>
and
<xref ref-type="disp-formula" rid="E6">4</xref>
in a computer simulation may be very inefficient. However, cortical circuitries may well perform these computations during sleep replay. To see this, let us write
<italic>p</italic>
(
<italic>t</italic>
) from Eqs
<xref ref-type="disp-formula" rid="E45">36</xref>
and
<xref ref-type="disp-formula" rid="E46">37</xref>
as the product</p>
<disp-formula id="E7">
<label>(5)</label>
<mml:math id="M7">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>E</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>Y</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where</p>
<disp-formula id="E8">
<label>(6)</label>
<mml:math id="M8">
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mi>j</mml:mi>
</mml:munder>
<mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where the product index
<italic>j</italic>
runs over all non-terminal nodes in tree
<italic>t</italic>
and
<bold>P</bold>
and
<bold>Q</bold>
are the subsets of terminal nodes (episode items) indirectly generated by the two children of node
<italic>j</italic>
.
<italic>s</italic>
(
<bold>P</bold>
,
<bold>Q</bold>
) is simply the average association strength between all items in
<bold>P</bold>
and all items in
<bold>Q</bold>
. Similarly,</p>
<disp-formula id="E9">
<label>(7)</label>
<mml:math id="M9">
<mml:mrow>
<mml:mi>Y</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mi>j</mml:mi>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where
<italic>c</italic>
<sub>1</sub>
(
<italic>j</italic>
) and
<italic>c</italic>
<sub>2</sub>
(
<italic>j</italic>
) are the children of node
<italic>j</italic>
in tree
<italic>t</italic>
. Thus,
<italic>E</italic>
(
<italic>t</italic>
) depends only on the episodic transition strengths
<italic>s</italic>
. For this reason, we will name it the episodic strength of tree
<italic>t</italic>
. Likewise,
<italic>Y</italic>
(
<italic>t</italic>
) depends only on the semantic transition matrix
<italic>a</italic>
. Therefore we will call it the semantic strength of tree
<italic>t</italic>
. These quantities may have a neural interpretation: the semantic network (i.e., the neocortex) can be seen as a set of repeated modules, which may correspond for example to a cortical column. Each of these modules is composed of an input layer consisting of coincidence detectors (corresponding to single cells or cell groups), each triggered by the co-activation of a pair of inputs (Figure
<xref ref-type="fig" rid="F3">3</xref>
A). This layer projects to an output layer, which can propagate activity to downstream modules, via a set of plastic connections, which represent the transition probabilities
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
). These modules are organized in a multi-layer hierarchy, in which each module sends inputs to all modules higher up (Figure
<xref ref-type="fig" rid="F3">3</xref>
B), reflecting sites spanning the entire cerebral cortex. At the base of this hierarchy sits the storage module for episodic memory, the hippocampus.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption>
<p>
<bold>Possible implementation of the model in the brain</bold>
.
<bold>(A)</bold>
A cortical module (roughly corresponding to a column), composed of a coincidence detection layer and an output layer, with modifiable connections between them.
<bold>(B)</bold>
A hierarchy of such modules, with the hippocampus at the bottom of the hierarchy.
<bold>(C)</bold>
Scheme of activation probabilities and amplitudes for a parsing tree.</p>
</caption>
<graphic xlink:href="fncom-05-00036-g003"></graphic>
</fig>
<p>These theoretical assumptions find possible counterparts in experimental data: from the dynamic point of view, hippocampal sharp waves may loosely correspond to reactivation events (Kudrimoti et al.,
<xref ref-type="bibr" rid="B36">1999</xref>
). At each event, hippocampal cell assemblies, one for each of items 1–4 in the episode of Figure
<xref ref-type="fig" rid="F3">3</xref>
C, are activated randomly, and activities propagate in the cortical hierarchy. At each cortical node, the probability of activating a coincidence detector is an increasing function of the probability of co-activation of the two groups of hippocampal units sending, through multiple layers, input to the two sides of the detector. Thus, the factors
<italic>e</italic>
<sup>β
<italic>s</italic>
(
<bold>P</bold>
,
<bold>Q</bold>
)</sup>
making up
<italic>E</italic>
(
<italic>t</italic>
) can be approximately computed at each level in the hierarchy. In the rat hippocampus, for example, this co-activation probability during sleep contains information about co-activations expressed during experience acquisition (Wilson and McNaughton,
<xref ref-type="bibr" rid="B86">1994</xref>
, so that it may carry the episodic association signal defined in our theory. The activation of each cortical module will be determined by the timing of its afferent inputs. Let us assume that module 5 is activated by inputs 1 and 2 and module 6 by inputs 3 and 4. Then, a downstream module 7, receiving inputs from modules 5 and 6, will be in a position to compute the probability of co-activation of the two sets of hippocampal units (1,2) and (3,4). In this way, all terms of the form
<italic>e</italic>
<sup>β
<italic>s</italic>
(
<bold>P</bold>
,
<bold>Q</bold>
)</sup>
entering Eq.
<xref ref-type="disp-formula" rid="E8">6</xref>
may be computed. Across multiple reactivation events, this neural dynamics is thus equivalent to a Monte-Carlo sampling of the semantic transition probabilities through the semantic strength (Eq.
<xref ref-type="disp-formula" rid="E9">7</xref>
) with a probability distribution given by the episodic strength
<italic>E</italic>
(
<italic>t</italic>
), ultimately yielding an estimate of the tree probability
<italic>p</italic>
(
<italic>t</italic>
), through Eq.
<xref ref-type="disp-formula" rid="E7">5</xref>
. If a tree is activated by a reactivation event, the activity level in the units making up the tree is given by the semantic strengths (Eq.
<xref ref-type="disp-formula" rid="E9">7</xref>
). This amplitude is a product of transition probabilities at all nodes in the trees. For each tree activation, this can be computed, in a bottom-up fashion, at the start node (the most downstream node in the hierarchy). It may then be communicated to lower nodes by top-down feedback connections, from higher-order (frontal) cortical areas to lower order sensory, upstream areas, reflecting top-down influences from frontal cortices. Eq.
<xref ref-type="disp-formula" rid="E2">2</xref>
has a further
<italic>E</italic>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
) term in the denominator, which lends itself to an interesting interpretation: this term is proportional to the general familiarity of the current episode (see Section
<xref ref-type="sec" rid="s8">2.12</xref>
). Thus, plasticity is suppressed for familiar episodes, and enhanced for novel ones, which have a larger impact on learning. This type of filtering is similar to the role assigned to cholinergic neuromodulation by theories of novelty-based gating of learning (Yu and Dayan,
<xref ref-type="bibr" rid="B89">2005</xref>
). It is interesting to note that the term
<italic>E</italic>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
) is computed at the top of the tree, which in a cortical hierarchy would correspond to the prefrontal cortex, harboring the cortical areas which exert the strongest control over neuromodulatory structures (Mesulam and Mufson,
<xref ref-type="bibr" rid="B48">1984</xref>
; Zaborszky et al.,
<xref ref-type="bibr" rid="B90">1997</xref>
), and have been implicated in novelty assessment (see, e.g., Ljungberg et al.,
<xref ref-type="bibr" rid="B40">1992</xref>
).</p>
<p>Last, connections representing the semantic transition matrix are modified according to the rule of Eq.
<xref ref-type="disp-formula" rid="E2">2</xref>
: at each reactivation event, only synapses in cortical modules recruited in the activated tree are modified. The two terms on the right hand side of Eq.
<xref ref-type="disp-formula" rid="E2">2</xref>
can be seen as giving rise to two plasticity processes: connections from the activated coincidence detector to the output layer are incremented by a factor
<italic>ηY</italic>
(
<italic>t</italic>
)
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
) (analogous to long-term potentiation), and synapses from all coincident detectors to the module's output layer are decreased by a factor
<italic>ηY</italic>
(
<italic>t</italic>
)(
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
))
<sup>2</sup>
, similar to long-term depression.</p>
</sec>
<sec>
<label>2.4</label>
<title>Definitions and notation</title>
<p>Data are supplied to the model as a sequence of distinct observations or
<italic>episodes</italic>
, with the
<italic>n</italic>
-th episode characterized by a set of
<italic>N</italic>
observed objects
<inline-formula>
<mml:math id="M10">
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo></mml:mo>
<mml:mo>{</mml:mo>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo></mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mi>N</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>}</mml:mo>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>
which constitute the terminal nodes of the parsing tree, and by an episodic association matrix
<inline-formula>
<mml:math id="M11">
<mml:mrow>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>
<inline-formula>
<mml:math id="M12">
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo></mml:mo>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo></mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo></mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
(the superscript
<italic>n</italic>
will be dropped whenever evident from context), which reflects the degree of associations between pairs of objects as they are perceived in that particular observation. The
<italic>s</italic>
<sup>(
<italic>n</italic>
)</sup>
matrix is supposed to be computed and stored in the episodic memory module; it encodes, for example, spatial and temporal proximity. Temporal ordering can be embedded in the representation by assuming that
<inline-formula>
<mml:math id="M13">
<mml:mrow>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo></mml:mo>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>
By convention, if
<inline-formula>
<mml:math id="M14">
<mml:mrow>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>
temporally follows
<inline-formula>
<mml:math id="M15">
<mml:mrow>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mi>j</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>
then
<inline-formula>
<mml:math id="M16">
<mml:mrow>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>></mml:mo>
<mml:msubsup>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>
Because the representation of each object has already been processed by the semantic modules at the moment of perception, the episodic association matrix will also reflect, indirectly, associational biases already present in the cortex.</p>
<p>We will say that subset
<bold>K</bold>
 ⊂ 
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
is
<italic>generated</italic>
in a parse tree if all and only the observables
<italic>o</italic>
 ∈ 
<bold>K</bold>
are the leaves of one of its sub-trees. Each parse tree will be assigned a probability equal to the product of the probability of each node. The probability of each node (say, a node in which node
<italic>i</italic>
generates, as children, nodes
<italic>j</italic>
and
<italic>k</italic>
) will be in turn, the product of two terms (Eq.
<xref ref-type="disp-formula" rid="E1">1</xref>
): one, originating from the episodic information, reflecting the episodic association between the subsets
<bold>P</bold>
and
<bold>Q</bold>
, generated in the parse, respectively, by nodes
<italic>j</italic>
and
<italic>k</italic>
. This will be given by the function
<italic>M</italic>
(
<bold>P</bold>
,
<bold>Q</bold>
), defined below. This term represents a major difference with respect to the original definition of SCFGs. The second term comes from the semantic module, and, like in a regular SCFG, reflects the probability that the two nodes
<italic>j</italic>
,
<italic>k</italic>
are generated by the parent
<italic>i</italic>
, given that parent
<italic>i</italic>
is used in the parsing. This latter probability represents the model's “belief” about the underlying causes of the current episode, and the consequent parsing. This is given by the semantic transition matrix
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
), which is learned from experience as explained in the next section.</p>
</sec>
<sec id="s2">
<label>2.5</label>
<title>Calculation of set-wise association
<italic>M</italic>
<sup>(
<italic>n</italic>
)</sup>
(P, Q)</title>
<p>In order to extract categories and concepts from an episode at all orders of complexity, it is necessary to evaluate the episodic associations not only between pairs of single items, but also between pairs of item subsets
<bold>I</bold>
and
<bold>J</bold>
(each subset potentially corresponding to a higher-order concept). We term the matrix containing such associations
<italic>M</italic>
<sup>(
<italic>n</italic>
)</sup>
. This matrix may be defined by means of a pairwise hierarchical clustering, based on the episodic association between terminal nodes for
<bold>I</bold>
,
<bold>J</bold>
 ⊂ 
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
,
<bold>I</bold>
 ∩ 
<bold>J</bold>
 = ø</p>
<disp-formula id="E10">
<label>(8)</label>
<mml:math id="M17">
<mml:mrow>
<mml:msup>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">I</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">J</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">I</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext>J is split into I</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">J</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Thus,
<italic>M</italic>
<sup>(
<italic>n</italic>
)</sup>
(
<bold>I</bold>
,
<bold>J</bold>
) quantifies the probability of recognizing
<bold>I</bold>
and
<bold>J</bold>
as coherent entities, when all the items in
<bold>I</bold>
 ∪ 
<bold>J</bold>
are presented.
<italic>M</italic>
<sup>(
<italic>n</italic>
)</sup>
(
<bold>I</bold>
,
<bold>J</bold>
) may be generated as follows.</p>
<list list-type="order">
<list-item>
<p>For the
<italic>n</italic>
-th episode, generate the set
<italic>T</italic>
<sup>0</sup>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
) of all possible binary trees with
<inline-formula>
<mml:math id="M18">
<mml:mrow>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo></mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mi>N</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>
as the (ordered) labeled terminal nodes. In a computer simulation this can be done by following the procedure devised by Rohlf (
<xref ref-type="bibr" rid="B61">1983</xref>
), augmented to generate all possible orderings of nodes.</p>
</list-item>
<list-item>
<p>For each tree
<italic>t</italic>
 ∈ 
<italic>T</italic>
<sup>0</sup>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
), compute a global episodic association strength
<italic>S</italic>
(
<italic>t</italic>
) with the following algorithm</p>
<list list-type="simple">
<list-item>
<label>(a)</label>
<p>set
<italic>S</italic>
(
<italic>t</italic>
) = 1</p>
</list-item>
<list-item>
<label>(b)</label>
<p>for each terminal node
<inline-formula>
<mml:math id="M19">
<mml:mrow>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>
, set
<inline-formula>
<mml:math id="M20">
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo>{</mml:mo>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>
(the set only composed by the element
<inline-formula>
<mml:math id="M21">
<mml:mrow>
<mml:msubsup>
<mml:mi>o</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula>
), where the function
<italic>L</italic>
(
<italic>i</italic>
) denotes the set of terminals generated by the node
<italic>i</italic>
in tree
<italic>t</italic>
</p>
</list-item>
<list-item>
<label>(c)</label>
<p>find the bottom-left-most node
<italic>n</italic>
which has two leaves
<italic>p</italic>
and
<italic>q</italic>
as children, eliminate
<italic>p</italic>
and
<italic>q</italic>
and substitute
<italic>n</italic>
with new terminal node
<italic>z</italic>
</p>
</list-item>
<list-item>
<label>(d)</label>
<p>set
<inline-formula>
<mml:math id="M22">
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>β</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>
</p>
</list-item>
<list-item>
<label>(e)</label>
<p>set
<italic>L</italic>
(
<italic>z</italic>
) = 
<italic>L</italic>
(
<italic>p</italic>
) ∪ 
<italic>L</italic>
(
<italic>q</italic>
)</p>
</list-item>
<list-item>
<label>(f)</label>
<p>generate the associations between
<italic>z</italic>
and all other terminal nodes
<italic>i</italic>
with the formula</p>
<p>
<disp-formula id="E11">
<mml:math id="M23">
<mml:mrow>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>z</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>z</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>where
<inline-formula>
<mml:math id="M24">
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle scriptlevel="+1">
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mo>#</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>#</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>
and #
<bold>P</bold>
is the cardinality of
<bold>P</bold>
.</p>
</list-item>
<list-item>
<label>(g)</label>
<p>go back to (c) until there is a single node</p>
</list-item>
</list>
<p>It is easy to demonstrate the following important</p>
<p>
<bold>Property</bold>
: Let
<italic>t</italic>
 ∈ 
<italic>T</italic>
<sup>0</sup>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
), that is, one of the trees that generate
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
. Let
<italic>u</italic>
and
<italic>v</italic>
be two sub-trees such that
<italic>t</italic>
 = 
<italic>u</italic>
°
<italic>v</italic>
(that is,
<italic>t</italic>
is formed by substituting the root of
<italic>v</italic>
for a terminal node
<italic>i</italic>
of
<italic>u</italic>
). Let
<bold>V</bold>
be such that
<italic>v</italic>
 ∈ 
<italic>T</italic>
<sup>0</sup>
(
<bold>V</bold>
) and
<italic>u</italic>
 ∈ 
<italic>T</italic>
<sup>1</sup>
(
<bold>O</bold>
<sup>(n)</sup>
\V), that is the set of all sub-trees having the elements of
<bold>O</bold>
<sup>(n)</sup>
\V, plus one extra “free” node
<italic>i</italic>
as terminals. Then</p>
<p>
<disp-formula id="E12">
<label>(9)</label>
<mml:math id="M25">
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>u</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>v</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">V</mml:mtext>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">V</mml:mtext>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
</list-item>
<list-item>
<p>set the
<italic>observational</italic>
probability of tree
<italic>t</italic>
based on the prescription:</p>
<p>
<disp-formula id="E13">
<label>(10)</label>
<mml:math id="M26">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
</mml:msup>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msup>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
</list-item>
<list-item>
<p>set</p>
</list-item>
</list>
<disp-formula id="E14">
<label>(11)</label>
<mml:math id="M27">
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where
<italic>T</italic>
<sup>0</sup>
(
<bold>P</bold>
,
<bold>Q</bold>
) is the set of all trees in which
<bold>P</bold>
 ∪ 
<bold>Q</bold>
is split into
<bold>P</bold>
and
<bold>Q</bold>
.</p>
<p>
<italic>M</italic>
(
<bold>P</bold>
,
<bold>Q</bold>
) can be interpreted as the probability that an observer having only access to the episodic information would split the set
<bold>P</bold>
 ∪ 
<bold>Q</bold>
into
<bold>P</bold>
and
<bold>Q</bold>
. Note how the parameter
<italic>β</italic>
performs a “softmax” operation of sorts: a large value of
<italic>β</italic>
will concentrate all the probability weight on the most probable tree, a lower value will distribute the weight more evenly.</p>
</sec>
<sec>
<label>2.6</label>
<title>Asymmetric associations</title>
<p>In order to encode temporal order, it is necessary to have asymmetric association strengths
<italic>s
<sub>ij</sub>
</italic>
. We have chosen here the form (Figure
<xref ref-type="fig" rid="F6">6</xref>
B):</p>
<disp-formula id="E15">
<label>(12)</label>
<mml:math id="M28">
<mml:mrow>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable columnalign="left">
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:msub>
<mml:mo>λ</mml:mo>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo></mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>></mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign="left">
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:msub>
<mml:mo>λ</mml:mo>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo></mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo><</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where
<italic>t</italic>
(
<italic>i</italic>
) is the time of occurrence of item
<italic>i</italic>
. In the simulations for Figures
<xref ref-type="fig" rid="F6">6</xref>
and
<xref ref-type="fig" rid="F7">7</xref>
,
<italic>λ</italic>
<sub>1</sub>
 = 5 and
<italic>λ</italic>
<sub>2</sub>
 = 1.5 were used, ensuring that associations were larger from preceding to subsequent items than vice versa.</p>
</sec>
<sec>
<label>2.7</label>
<title>Full generative model</title>
<p>The branching process (Eq.
<xref ref-type="disp-formula" rid="E1">1</xref>
) gives a prescription for generating episodes. The association matrix enters the transition probabilities only through the
<italic>M</italic>
(
<bold>P</bold>
,
<bold>Q</bold>
) functions. Thus the generative model can be expressed as follows (a minimal sampling sketch follows the list):</p>
<list list-type="order">
<list-item>
<p>generate a parsing tree
<italic>t</italic>
with probability</p>
<p>
<disp-formula id="E16">
<mml:math id="M29">
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>N</mml:mi>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>where
<italic>NT</italic>
(
<italic>t</italic>
) is the set of nodes in tree
<italic>t</italic>
,
<italic>c</italic>
<sub>1,2</sub>
(
<italic>i</italic>
,
<italic>t</italic>
) are the two children of node
<italic>i</italic>
.</p>
</list-item>
<list-item>
<p>draw the association matrix
<italic>S
<sup>n</sup>
</italic>
according to the distribution</p>
<p>
<disp-formula id="E17">
<mml:math id="M30">
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>S</mml:mi>
<mml:mi>n</mml:mi>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mtext mathvariant="bold">Z</mml:mtext>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>N</mml:mi>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>where
<bold>C</bold>
(
<italic>i</italic>
) is the subset of the episode spanned by the node
<italic>i</italic>
, and the constant Z ensures normalization of
<italic>P
<sub>assoc</sub>
</italic>
.</p>
</list-item>
</list>
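<p>As an illustration, the following is a minimal Python sketch of this two-step procedure. It is a toy rendering under simplifying assumptions, not the simulation code used for the figures: productions are stored as a dictionary a[i] mapping child pairs (j, k) to probabilities, the compatibility function M is supplied by the caller, and the names sample_tree, leaves, and assoc_weight are hypothetical.</p>
<preformat>
import random

def sample_tree(symbol, a, terminals, depth=0, max_depth=6):
    # Step 1: expand non-terminals top-down according to the
    # production probabilities a(i, j, k) (the P_tree distribution above).
    if symbol in terminals or depth == max_depth:
        return symbol
    pairs = list(a[symbol].keys())
    probs = list(a[symbol].values())
    j, k = random.choices(pairs, weights=probs, k=1)[0]
    return (symbol,
            sample_tree(j, a, terminals, depth + 1, max_depth),
            sample_tree(k, a, terminals, depth + 1, max_depth))

def leaves(tree):
    # Terminals spanned by a (sub)tree, i.e. the sets C(i) above.
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[1]) + leaves(tree[2])

def assoc_weight(tree, M):
    # Step 2: unnormalized weight of an association structure under
    # P_assoc (up to the constant Z), as the product of
    # M(C(c1), C(c2)) over all internal nodes of the tree.
    if not isinstance(tree, tuple):
        return 1.0
    _, left, right = tree
    return (M(frozenset(leaves(left)), frozenset(leaves(right)))
            * assoc_weight(left, M) * assoc_weight(right, M))

# Toy usage with two productions and a trivially uniform M.
a = {"S": {("A", "B"): 0.7, ("B", "A"): 0.3}}
t = sample_tree("S", a, terminals={"A", "B"})
print(t, assoc_weight(t, lambda P, Q: 1.0))
</preformat>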
</sec>
<sec id="s3">
<label>2.8</label>
<title>Generalized inside–outside algorithm for extraction of semantic information</title>
<p>The extraction of the semantic transition matrix
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
) from experience is performed by means of a generalized inside–outside algorithm (Lari and Young,
<xref ref-type="bibr" rid="B37">1990</xref>
). Inside–outside is the branching-process equivalent of the forward–backward algorithm used to train Hidden Markov Models (Rabiner,
<xref ref-type="bibr" rid="B56">1989</xref>
). The main difference between the algorithm presented here and the algorithm by Lari and Young (
<xref ref-type="bibr" rid="B37">1990</xref>
) is that here we deal with data whose interrelationships are more complex than can be captured by sequential ordering. Rather, we rely on the associations encoded by the episodic module to determine which nodes can be parsed as siblings.</p>
<p>Similarly to Lari and Young (
<xref ref-type="bibr" rid="B37">1990</xref>
), we assume that each episode
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
is generated by a tree whose root is the start symbol
<italic>S</italic>
.</p>
<p>In SCFGs, the matrix
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
) is defined as:</p>
<disp-formula id="E18">
<label>(13)</label>
<mml:math id="M31">
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mtext> is used in the parse tree</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>with</p>
<disp-formula id="E19">
<label>(14)</label>
<mml:math id="M32">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mtext>  </mml:mtext>
<mml:mo></mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>We modified this rule, according to Eq.
<xref ref-type="disp-formula" rid="E1">1</xref>
:</p>
<disp-formula id="E20">
<mml:math id="M33">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>M</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where
<bold>P</bold>
and
<bold>Q</bold>
are the sets of terminals descending from
<italic>j</italic>
and
<italic>k</italic>
, respectively. The matrix
<italic>a</italic>
represents the generative model of the world that constitutes the semantic memory, and we will denote it by the letter
<italic>G</italic>
.</p>
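<p>Expressed as code, under the same dictionary-based conventions as the sketch in Section 2.7, the modified production rule reads (production_prob is an illustrative name, not part of the original model specification):</p>
<preformat>
def production_prob(a, M, i, j, k, P, Q):
    # Modified SCFG rule: the grammar probability a(i, j, k) is
    # weighted by the episodic compatibility M(P, Q) of the terminal
    # sets P and Q spanned by the two children j and k.
    return a[i][(j, k)] * M(P, Q)
</preformat>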
<sec id="s4">
<label>2.8.1</label>
<title>The E-step</title>
<p>In the E-step of an EM algorithm, the probabilities of the observed data are evaluated based on the current value of the hidden parameters in the model, in this case the
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
) matrix. To do so, the generalized inside–outside algorithm defines the
<italic>inside</italic>
probabilities (Figure
<xref ref-type="fig" rid="F2">2</xref>
B), for the
<italic>n</italic>
-th episode:</p>
<disp-formula id="E21">
<label>(15)</label>
<mml:math id="M34">
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi>G</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>For sets
<bold>P</bold>
of cardinality 1:</p>
<disp-formula id="E22">
<label>(16)</label>
<mml:math id="M35">
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mo>{</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>}</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
<mml:mtext mathvariant="bold"> P</mml:mtext>
<mml:mo></mml:mo>
<mml:mo>{</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>}</mml:mo>
<mml:mtext></mml:mtext>
<mml:mo></mml:mo>
<mml:mi>i</mml:mi>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>
<italic>e</italic>
(
<italic>i</italic>
,
<bold>P</bold>
) is the probability that
<italic>i</italic>
generates the subset
<bold>P</bold>
of the episode
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
. Note that the
<italic>e</italic>
(
<italic>i</italic>
,
<bold>P</bold>
) can be computed recursively in a “bottom-up” fashion, from the inside probabilities for the subsets of
<bold>P</bold>
through Eq.
<xref ref-type="disp-formula" rid="E21">15</xref>
, with the starting condition in Eq.
<xref ref-type="disp-formula" rid="E22">16</xref>
.</p>
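<p>The following Python sketch makes this bottom-up recursion concrete; it enumerates all subsets explicitly and is therefore practical only for small episodes. The function name inside_probs and its dictionary-based arguments are illustrative assumptions, not the authors' implementation.</p>
<preformat>
from itertools import combinations

def inside_probs(episode, symbols, a, M):
    # Bottom-up computation of the inside probabilities e(i, P) of
    # Eq. 15, with the starting condition of Eq. 16. Subsets P are
    # visited in order of increasing cardinality, so every split
    # (Q, P \ Q) is already available when P is processed. The cost
    # is exponential in episode size; this is a small-episode sketch.
    e = {(i, frozenset([i])): 1.0 for i in episode}   # Eq. 16
    items = list(episode)
    for size in range(2, len(items) + 1):
        for P in map(frozenset, combinations(items, size)):
            for i in symbols:
                total = 0.0
                for qsize in range(1, size):
                    for Q in map(frozenset, combinations(P, qsize)):
                        R = P - Q
                        for (j, k), p in a.get(i, {}).items():
                            total += (p * M(Q, R)
                                      * e.get((j, Q), 0.0)
                                      * e.get((k, R), 0.0))
                e[(i, P)] = total
    return e
</preformat>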
<p>The
<italic>outside</italic>
probabilities
<italic>f</italic>
(
<italic>i</italic>
,
<bold>P</bold>
) are defined as:</p>
<disp-formula id="E23">
<label>(17)</label>
<mml:math id="M36">
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo></mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mtext>Q:P</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>with the condition</p>
<disp-formula id="E24">
<label>(18)</label>
<mml:math id="M37">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where
<italic>S</italic>
is the start symbol that will be at the root of all parse trees.
<italic>f</italic>
(
<italic>i</italic>
,
<bold>P</bold>
) are the probabilities that the start symbol
<italic>S</italic>
generates everything but the set
<bold>P</bold>
,
<italic>plus</italic>
the symbol
<italic>i</italic>
(Figure
<xref ref-type="fig" rid="F2">2</xref>
B). Note that, once the inside probabilities are computed, the outside probabilities can be computed recursively “top-down” from the outside probabilities for supersets of
<bold>P</bold>
, with the starting condition in Eq.
<xref ref-type="disp-formula" rid="E24">18</xref>
.</p>
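<p>A matching top-down sketch, reusing the inside table computed by the inside_probs sketch above and assuming the start symbol S is included among the symbols, could look as follows (the argument order of M mirrors Eq. 17):</p>
<preformat>
def outside_probs(episode, symbols, a, M, e, S="S"):
    # Top-down computation of the outside probabilities f(i, P) of
    # Eq. 17, starting from the condition f(S, O) = 1 of Eq. 18 and
    # reusing the inside table e returned by inside_probs().
    O = frozenset(episode)
    f = {}
    all_sets = sorted({P for (_, P) in e}, key=len, reverse=True)
    for P in all_sets:
        for i in symbols:
            if P == O:
                f[(i, P)] = 1.0 if i == S else 0.0    # Eq. 18
                continue
            total = 0.0
            for Q in all_sets:
                if not Q > P:      # need P a proper subset of Q
                    continue
                R = Q - P
                for j in symbols:
                    for (c1, c2), p in a.get(j, {}).items():
                        if c2 == i:   # i as right child: a(j, k, i)
                            total += (p * M(P, R) * f.get((j, Q), 0.0)
                                      * e.get((c1, R), 0.0))
                        if c1 == i:   # i as left child: a(j, i, k)
                            total += (p * M(R, P) * f.get((j, Q), 0.0)
                                      * e.get((c2, R), 0.0))
            f[(i, P)] = total
    return f
</preformat>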
</sec>
<sec id="s5">
<label>2.8.2</label>
<title>The M-step</title>
<p>Once the
<italic>e</italic>
and
<italic>f</italic>
probabilities are computed, the M-step, that is, the optimization of the semantic probabilities
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
), can be performed as follows.</p>
<p>First, note that</p>
<disp-formula id="E25">
<label>(19)</label>
<mml:math id="M38">
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>|</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>|</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>Let
<italic>E</italic>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
) = 
<italic>P</italic>
(
<italic>S</italic>
 → 
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
|
<italic>G</italic>
) = 
<italic>e</italic>
(
<italic>S</italic>
,
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
).</p>
<p>We have</p>
<disp-formula id="E26">
<label>(20)</label>
<mml:math id="M39">
<mml:mtable columnalign="right">
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>G</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mtext> is used in the parse</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>|</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:munder>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>M</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>and</p>
<disp-formula id="E27">
<label>(21)</label>
<mml:math id="M40">
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mtext> is used</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>|</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>M</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>and then, by Bayes' rule:</p>
<disp-formula id="E28">
<label>(22)</label>
<mml:math id="M41">
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>|</mml:mo>
<mml:mi>i</mml:mi>
<mml:mtext> is used</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mtext> is used</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mtext> is used</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>Following Lari and Young (
<xref ref-type="bibr" rid="B37">1990</xref>
), if there are
<italic>N</italic>
episodes that are memorized,
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
) can be computed as:</p>
<disp-formula id="E29">
<label>(23)</label>
<mml:math id="M42">
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mi>n</mml:mi>
</mml:msub>
<mml:mrow>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mtext mathvariant="bold">O</mml:mtext>
</mml:mstyle>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mi>n</mml:mi>
</mml:msub>
<mml:mrow>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mtext mathvariant="bold">O</mml:mtext>
</mml:mstyle>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where each term in the sums refers to one of the <italic>N</italic> stored episodes.</p>
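<p>Combining the two E-step sketches above, one batch EM pass over the stored episodes may be rendered as follows (again an illustrative sketch under the same assumed conventions, not the authors' code):</p>
<preformat>
def reestimate(episodes, symbols, a, M):
    # One batch EM pass (Eqs. 19-23): accumulate the expected counts
    # Gamma_ijk (Eq. 21) and the occupancies Gamma_i (Eq. 20) over all
    # stored episodes, then re-normalize to obtain the updated
    # transition matrix a(i, j, k) as in Eq. 23.
    num, den = {}, {}
    for ep in episodes:
        e = inside_probs(ep, symbols, a, M)
        f = outside_probs(ep, symbols, a, M, e)
        sets = {P for (_, P) in e}
        for i in symbols:
            gamma_i = sum(e.get((i, P), 0.0) * f.get((i, P), 0.0)
                          for P in sets)                      # Eq. 20
            den[i] = den.get(i, 0.0) + gamma_i
            for (j, k), p in a.get(i, {}).items():
                g = 0.0
                for P in sets:
                    for Q in sets:
                        if P > Q:            # Q a proper subset of P
                            R = P - Q
                            g += (p * e.get((j, Q), 0.0)
                                  * e.get((k, R), 0.0) * M(Q, R)
                                  * f.get((i, P), 0.0))       # Eq. 21
                num[(i, j, k)] = num.get((i, j, k), 0.0) + g
    return {i: {(j, k): num.get((i, j, k), 0.0) / den[i]
                for (j, k) in a.get(i, {})}
            for i in symbols if den.get(i, 0.0)}              # Eq. 23
</preformat>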
</sec>
</sec>
<sec id="s6">
<label>2.9</label>
<title>Online learning</title>
<p>The previous sections show how an iterative EM algorithm can be defined: first, the semantic transition matrix is randomly initialized; then, in the E-step, the inside probabilities are computed
<italic>bottom-up</italic>
with Eq.
<xref ref-type="disp-formula" rid="E21">15</xref>
, from the data and the current value of
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
); then the outside probabilities are computed
<italic>top-down</italic>
with Eq.
<xref ref-type="disp-formula" rid="E23">17</xref>
. In the M-step the
<italic>a</italic>
matrix is re-evaluated by means of Eq.
<xref ref-type="disp-formula" rid="E28">22</xref>
. Optimization takes place by multiple EM iterations. These equations, however, presuppose
<italic>batch</italic>
learning, that is, all episodes are available for training the model at the same time. A more flexible framework, which can more closely reproduce the way actual memories are acquired, needs to be updated incrementally, one episode at a time. An incremental form of the algorithm, as delineated by Neal and Hinton (
<xref ref-type="bibr" rid="B51">1998</xref>
), can be defined in analogy with the online update rule for Hidden Markov Models defined by Baldi and Chauvin (
<xref ref-type="bibr" rid="B2">1994</xref>
): define</p>
<disp-formula id="E30">
<label>(24)</label>
<mml:math id="M43">
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>k</mml:mi>
<mml:mo></mml:mo>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>k</mml:mi>
<mml:mo></mml:mo>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>which fulfills by definition the normalization constraint of Eq.
<xref ref-type="disp-formula" rid="E19">14</xref>
. We now define an online update rule for the
<italic>w</italic>
matrix</p>
<disp-formula id="E31">
<label>(25)</label>
<mml:math id="M44">
<mml:mrow>
<mml:mo>Δ</mml:mo>
<mml:mi>w</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mo>η</mml:mo>
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>or, for small
<italic>η</italic>
:</p>
<disp-formula id="E32">
<label>(26)</label>
<mml:math id="M45">
<mml:mrow>
<mml:mo>Δ</mml:mo>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo></mml:mo>
<mml:mfrac>
<mml:mo>η</mml:mo>
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo></mml:mo>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msup>
<mml:mi>a</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where, again,
<italic>η</italic>
is a rate parameter controlling the speed of the learning process. In analogy to Baldi and Chauvin's (
<xref ref-type="bibr" rid="B2">1994</xref>
) work, this rule will converge toward a (possibly local) maximum of
<italic>p</italic>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
,
<italic>G</italic>
) = 
<italic>p</italic>
(
<italic>S</italic>
 → 
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
|
<italic>G</italic>
), the likelihood of model
<italic>G</italic>
(completely defined by
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
)) given the observation
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
. As explained in the Results section, the two terms on the right-hand side of Eq.
<xref ref-type="disp-formula" rid="E31">25</xref>
may be interpreted, respectively, as homosynaptic LTP and heterosynaptic LTD.</p>
<p>Note the role played by
<italic>E</italic>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
) in the denominator of Eq.
<xref ref-type="disp-formula" rid="E32">26</xref>
: when the likelihood is low, for example when a novel episode is presented, the change in
<italic>w</italic>
will be relatively larger than when a well-learned episode (one to which the current state of the model assigns a high likelihood) is presented. Hence, the rule favors learning of novel information over familiar episodes.</p>
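<p>A minimal sketch of this online rule, assuming the Gamma terms and the episode likelihood E(O) have been computed as in the E-step sketches above, and using the same dictionary conventions:</p>
<preformat>
import math

def online_update(w, a, i, j, k, gamma_ijk, gamma_i, likelihood, eta=0.1):
    # Online M-step (Eqs. 24-25). w parametrizes a through a softmax,
    # so the normalization constraint of Eq. 14 holds by construction.
    # The 1 / E(O) factor makes surprising (low-likelihood) episodes
    # drive larger weight changes than familiar ones.
    w[(i, j, k)] += (eta / likelihood) * (gamma_ijk
                                          - gamma_i * a[i][(j, k)])
    # Recompute the production probabilities a(i, ., .) from w (Eq. 24).
    z = sum(math.exp(w[(i, jj, kk)]) for (jj, kk) in a[i])
    for (jj, kk) in a[i]:
        a[i][(jj, kk)] = math.exp(w[(i, jj, kk)]) / z
</preformat>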
</sec>
<sec>
<label>2.10</label>
<title>Local form of the learning rule</title>
<p>We show here how the inside and outside probabilities, as well as the quantities entering the optimization rule for the semantic transition matrix, can be expressed as sums and products of factors that can be computed locally at the nodes of a hierarchical neural network. This step is essential for mapping our algorithm onto a neural network model.</p>
<p>Let us consider first the inside probabilities: from Eq.
<xref ref-type="disp-formula" rid="E21">15</xref>
we have</p>
<disp-formula id="E33">
<mml:math id="M46">
<mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>having defined
<bold>Q</bold>
<sub>2</sub>
 = 
<bold>P</bold>
\
<bold>Q</bold>
<sub>1</sub>
. By making use of Eq.
<xref ref-type="disp-formula" rid="E14">11</xref>
we then obtain:</p>
<disp-formula id="E34">
<mml:math id="M47">
<mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where
<italic>T</italic>
(
<bold>Q</bold>
<sub>1</sub>
,
<bold>Q</bold>
<sub>2</sub>
) is the set of all trees in which set
<bold>P</bold>
is split into sets
<bold>Q</bold>
<sub>1</sub>
and
<bold>Q</bold>
<sub>2</sub>
and then, by making use of Eq.
<xref ref-type="disp-formula" rid="E12">9</xref>
:</p>
<disp-formula id="E35">
<label>(27)</label>
<mml:math id="M48">
<mml:mtable columnalign="right">
<mml:mtr>
<mml:mtd>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mtext>(</mml:mtext>
<mml:mi>n</mml:mi>
<mml:mtext>)</mml:mtext>
</mml:mrow>
</mml:msup>
<mml:mtext>(</mml:mtext>
<mml:mi>i</mml:mi>
<mml:mtext>,P)</mml:mtext>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>3</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>3</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mstyle>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>×</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>3</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>3</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>\</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>×</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mn>1</mml:mn>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
<mml:mtext></mml:mtext>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mn>2</mml:mn>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>Here,
<italic>T</italic>
<sup>0</sup>
(
<bold>P</bold>
) denotes the
<italic>inside trees</italic>
of
<bold>P</bold>
, i.e., the set of trees spanning the ordered subset
<bold>P</bold>
, and
<italic>T</italic>
<sup>1</sup>
(
<bold>P</bold>
) is the set of
<italic>outside trees</italic>
of
<bold>P</bold>
, i.e., the set of sub-trees starting from
<italic>S</italic>
and spanning
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
\<bold>P</bold>, plus an extra terminal node. Let us now define the reduced inside probabilities</p>
<disp-formula id="E36">
<label>(28)</label>
<mml:math id="M49">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>e</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>we then have</p>
<disp-formula id="E37">
<label>(29)</label>
<mml:math id="M50">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>e</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
<mml:mover accent="true">
<mml:mi>e</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mover accent="true">
<mml:mi>e</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>By induction we can then prove that</p>
<disp-formula id="E38">
<label>(30)</label>
<mml:math id="M51">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>e</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:mi>N</mml:mi>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where
<italic>c</italic>
<sub>1</sub>
(
<italic>i</italic>
),
<italic>c</italic>
<sub>2</sub>
(
<italic>i</italic>
) are the left and right children of node
<italic>i</italic>
in tree
<italic>t</italic>
(with labeled non-terminals) and
<bold>C</bold>
(
<italic>i</italic>
,
<italic>t</italic>
) is the subset of terminals generated by node
<italic>i</italic>
in tree
<italic>t</italic>
.
<italic>NT</italic>
(
<italic>t</italic>
) is the set of labeled non-terminal nodes of tree
<italic>t</italic>
.
<italic>T</italic>
<sup>0,
<italic>l</italic>
</sup>
(
<bold>P</bold>
,
<italic>i</italic>
) is the set of trees with labeled non-terminals whose root is labeled
<italic>i</italic>
and generating
<bold>P</bold>
.</p>
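<p>Under the same illustrative conventions as the earlier sketches, and assuming a caller-supplied split cost s(Q1, Q2) derived from the association matrix (Eq. 11, defined earlier in the Methods), the reduced inside recursion of Eq. 29 can be written as:</p>
<preformat>
import math
from itertools import combinations

def reduced_inside(episode, symbols, a, s, beta):
    # Reduced inside probabilities of Eqs. 28-29, with the episodic
    # contribution entering as exp(-beta * s(Q1, Q2)).
    # Base case: a single terminal is generated with probability 1.
    e = {(i, frozenset([i])): 1.0 for i in episode}
    items = list(episode)
    for size in range(2, len(items) + 1):
        for P in map(frozenset, combinations(items, size)):
            for i in symbols:
                e[(i, P)] = sum(
                    p * math.exp(-beta * s(Q, P - Q))
                      * e.get((j, Q), 0.0) * e.get((k, P - Q), 0.0)
                    for qsize in range(1, size)
                    for Q in map(frozenset, combinations(P, qsize))
                    for (j, k), p in a.get(i, {}).items())
    return e
</preformat>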
<p>Similarly, for the outside probabilities, from Eq.
<xref ref-type="disp-formula" rid="E23">17</xref>
:</p>
<disp-formula id="E39">
<label>(31)</label>
<mml:math id="M52">
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo>:</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>where we assumed
<bold>P</bold>
<sub>1</sub>
 = 
<bold>P</bold>
and
<bold>P</bold>
<sub>2</sub>
 = 
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
\<bold>P</bold>.</p>
<p>Inserting Eq.
<xref ref-type="disp-formula" rid="E14">11</xref>
in the previous formula we obtain</p>
<disp-formula id="E40">
<label>(32)</label>
<mml:math id="M53">
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo>:</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">n</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
<mml:mo>+</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>×</mml:mo>
<mml:mtext></mml:mtext>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>
<italic>T</italic>
(
<bold>P</bold>
<sub>1</sub>
,
<bold>P</bold>
<sub>2</sub>
) is the set of possible trees which generate
<bold>Q</bold>
and split it into
<bold>P</bold>
<sub>1</sub>
and
<bold>P</bold>
<sub>2</sub>
. By the property defined in Eq.
<xref ref-type="disp-formula" rid="E12">9</xref>
:</p>
<disp-formula id="E41">
<mml:math id="M54">
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo>:</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>×</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>×</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>3</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
<mml:mo>+</mml:mo>
<mml:mi>s</mml:mi>
<mml:mi>w</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>p</mml:mi>
<mml:mtext>.</mml:mtext>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>“Swap” here refers to the same terms with the roles of
<bold>P</bold><sub>1</sub> and <bold>P</bold><sub>2</sub> swapped. We can define the reduced outside probabilities
\tilde{f}(i,\mathbf{P}) = f(i,\mathbf{P}) \Big/ \sum_{t\in T^{0}(\mathbf{P})} S(t).
Eq.
<xref ref-type="disp-formula" rid="E24">18</xref>
gives us an initialization condition for the
<italic>f</italic>:</p>
<disp-formula id="E42">
<label>(33)</label>
<mml:math id="M56">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">n</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>We then have</p>
<disp-formula id="E43">
<label>(34)</label>
<mml:math id="M57">
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">n</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo>:</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">n</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>×</mml:mo>
<mml:mover accent="true">
<mml:mi>e</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mtext>swap.</mml:mtext>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>By induction:</p>
<disp-formula id="E44">
<label>(35)</label>
<mml:math id="M58">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">n</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:mi>N</mml:mi>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where now
<italic>T</italic><sup>1,<italic>l</italic></sup>(<bold>P</bold>,<italic>i</italic>)
is the set of outside trees of
<bold>P</bold>
in which the extra terminal has exactly the label
<italic>i</italic>
. We can now express Γ
<italic><sub>i</sub></italic>
as follows:</p>
<disp-formula id="E45">
<label>(36)</label>
<mml:math id="M59">
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:munder>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:munder>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
<mml:mover accent="true">
<mml:mi>e</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">n</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mi>l</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mtext mathvariant="bold">n</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:mi>N</mml:mi>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>where the sum runs over the set
<italic>T<sup>l</sup></italic>(<bold>O</bold><sup>(<italic>n</italic>)</sup>,<italic>i</italic>)
of all trees with labeled non-terminals in which one non-terminal is labeled
<italic>i</italic>, and we used the fact that
\frac{1}{\mathbf{P}(\mathbf{O}^{(n)},G)}\, \frac{1}{\sum_{t\in T^{0}(\mathbf{O}^{(n)})} S(t)} = \tilde{e}\bigl(S,\mathbf{O}^{(n)}\bigr).
</p>
<p>Also,</p>
<disp-formula id="E46">
<label>(37)</label>
<mml:math id="M61">
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:msub>
<mml:mo>Γ</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>e</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>M</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mover accent="true">
<mml:mi>e</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mover accent="true">
<mml:mi>e</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>M</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>×</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>3</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>3</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mover accent="true">
<mml:mi>e</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mover accent="true">
<mml:mi>e</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">Q</mml:mtext>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>˜</mml:mo>
</mml:mover>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msub>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mrow></mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mi>l</mml:mi>
</mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo></mml:mo>
<mml:mi>N</mml:mi>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>where
<italic>T<sup>l</sup></italic>(<bold>O</bold><sup>(<italic>n</italic>)</sup>,<italic>i</italic> → <italic>j</italic>,<italic>k</italic>)
denotes the set of trees with labeled non-terminals in which one non-terminal is labeled
<italic>i</italic>
and has
<italic>j</italic>
and
<italic>k</italic>
as its left and right children, respectively. Thus Γ
<italic><sub>i</sub></italic>
and Γ
<italic><sub>ijk</sub></italic>
can be expressed as sums of terms locally computable at each node according to Eqs
<xref ref-type="disp-formula" rid="E5">3</xref>–<xref ref-type="disp-formula" rid="E9">7</xref>.</p>
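<p>Since Γ<italic><sub>i</sub></italic> and Γ<italic><sub>ijk</sub></italic> decompose into node-local terms, a Monte-Carlo version of the E-step can simply accumulate these terms over sampled parse trees, followed by a renormalizing M-step. A minimal Python sketch (not the authors' code): the <italic>Node</italic> structure, its per-node weight <italic>w</italic>, and the dict-based transition matrix <italic>a</italic> are our assumptions.</p>
<preformat>
from collections import defaultdict

def accumulate_counts(root, gamma_i, gamma_ijk):
    """Add the local terms of one sampled parse tree to the running
    Gamma_i and Gamma_ijk accumulators (Monte-Carlo E-step).
    `root` is a hypothetical Node with fields label, left, right, w."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.left is None:  # terminal node: no transition term
            continue
        i, j, k = node.label, node.left.label, node.right.label
        gamma_i[i] += node.w           # local weight of this node
        gamma_ijk[(i, j, k)] += node.w
        stack.append(node.left)
        stack.append(node.right)

def update_transitions(a, gamma_i, gamma_ijk, eta=0.2):
    """M-step: move each transition probability a(i, j, k) toward the
    normalized count Gamma_ijk / Gamma_i with learning rate eta."""
    for (i, j, k), g in gamma_ijk.items():
        if gamma_i[i] > 0:
            a[(i, j, k)] += eta * (g / gamma_i[i] - a[(i, j, k)])

# Typical usage per consolidation iteration:
# gamma_i, gamma_ijk = defaultdict(float), defaultdict(float)
# for t in sampled_trees:
#     accumulate_counts(t, gamma_i, gamma_ijk)
# update_transitions(a, gamma_i, gamma_ijk)
</preformat>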
</sec>
<sec id="s7">
<label>2.11</label>
<title>Optimal parsing</title>
<p>After training, in order to test the retrieval performance of our semantic memory model, it is useful to compute the optimal parsing tree for a given episode, given the current state of the model, that is, of the
<italic>a</italic>
matrix. While this would be straightforward in a parallel neural network, which could estimate the likelihood of all trees at the same time, it is much harder to accomplish in computer simulations. A natural definition of an optimal tree is the tree
<italic>t</italic>
that maximizes the tree probability</p>
<disp-formula id="E47">
<label>(38)</label>
<mml:math id="M62">
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>N</mml:mi>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo></mml:mo>
<mml:mo>β</mml:mo>
<mml:mi>M</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>However, finding the maximum of this expression requires a search over all possible binary trees with labeled terminals and non-terminals, a prohibitively large number. Following Goodman (
<xref ref-type="bibr" rid="B24">1996</xref>
), we adopted an alternative definition of optimality: the tree
<italic>t</italic>
that maximizes the number of correctly labeled constituents,</p>
<disp-formula id="E48">
<label></label>
<mml:math id="M63">
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mi>N</mml:mi>
<mml:mi>T</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="bold">C</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo></mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo></mml:mo>
<mml:mtext mathvariant="bold">P</mml:mtext>
<mml:mo>|</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>This measure tells us how many constituents (subsets of the entire episode) are correctly labeled, that is, how many have their elements as the terminal nodes of a subtree with the “correct” (according to the generative model) non-terminal as its root. This measure can be obtained from the inside and outside probabilities (which can be computed efficiently), and can be maximized with a simple dynamic programming algorithm, as described by Goodman (
<xref ref-type="bibr" rid="B24">1996</xref>
). This procedure was used to estimate the optimal trees displayed in Figures
<xref ref-type="fig" rid="F4">4</xref>
C,
<xref ref-type="fig" rid="F5">5</xref>
C,
<xref ref-type="fig" rid="F6">6</xref>
D, and
<xref ref-type="fig" rid="F8">8</xref>
C.</p>
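<p>A minimal sketch of this dynamic program over item subsets, assuming the products <italic>e</italic>(<italic>i</italic>,<bold>P</bold>)·<italic>f</italic>(<italic>i</italic>,<bold>P</bold>) have been precomputed into a dictionary <italic>score</italic> (hypothetical names, not the authors' code). The exhaustive subset enumeration makes it practical only for the small episodes used here (5–6 items).</p>
<preformat>
from functools import lru_cache
from itertools import combinations

def proper_subsets(Q):
    """Non-empty proper subsets of frozenset Q."""
    items = sorted(Q)
    for r in range(1, len(items)):
        for c in combinations(items, r):
            yield frozenset(c)

def best_tree(episode, score, labels):
    """Maximize the expected number of correctly labeled constituents.
    score[(i, Q)] stands for e(i, Q) * f(i, Q), assumed precomputed."""
    @lru_cache(maxsize=None)
    def best(Q):
        # best label for this constituent
        i_best = max(labels, key=lambda i: score.get((i, Q), 0.0))
        node_val = score.get((i_best, Q), 0.0)
        if len(Q) == 1:
            return node_val, (i_best, tuple(Q)[0])
        best_val, best_children = -1.0, None
        for Q1 in proper_subsets(Q):
            v1, t1 = best(Q1)
            v2, t2 = best(Q - Q1)
            if v1 + v2 > best_val:
                best_val, best_children = v1 + v2, (t1, t2)
        return node_val + best_val, (i_best, best_children)

    return best(frozenset(episode))
</preformat>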
<fig id="F4" position="float">
<label>Figure 4</label>
<caption>
<p>
<bold>Simulation of bottom-up influences and decontextualization</bold>
.
<bold>(A)</bold>
Training set, composed of 8 episodes of 5 items each.
<bold>(B)</bold>
Test episode (not used for training).
<bold>(C)</bold>
Optimal parsing trees computed by the semantic model after 5, 50, 100, and 150 iterations.
<bold>(D)</bold>
Time course of the log-likelihood of each of the parsing trees in
<bold>(C)</bold>
: the optimal ones after 5 iterations (dash-dotted line), 50 iterations (dashed line), 100 iterations (dotted line), and 150 iterations (solid line).</p>
</caption>
<graphic xlink:href="fncom-05-00036-g004"></graphic>
</fig>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption>
<p>
<bold>Simulation of top-down influences</bold>
.
<bold>(A)</bold>
Training set, composed of 11 5-item episodes (gray background), and a “probe” episode (white background). The groups (4,5) and (6,7), which occur in the same context in different episodes, are highlighted by the ellipses. The last training episode is termed the prototype, because it contains the same items as the test episode, except for a (4,5) to (6,7) substitution.
<bold>(B)</bold>
Time courses of the log-probabilities of the transitions from non-terminal node 11 to the groups (4,5) and (6,7), indicated by crosses and circles, respectively. The outside probability that node 11 generates (6,7) in the second training episode is also shown (solid line).
<bold>(C)</bold>
Optimal parse trees for the test episode after 1000 and 3000 iterations and for the prototype episode after 1000 iterations.
<bold>(D)</bold>
Cross-validation test for potential over-fitting. From the grammar model trained in the simulation of
<bold>(B,C)</bold>
, 48 episodes were drawn. The first 40 were used to train a new grammar model for 2000 iterations; the remaining 8 were used for cross-validation. The figure shows the time evolution of the average log-likelihood for the training set (solid line), the cross-validation set (dashed line), and a scrambled control set obtained by randomly shuffling the item labels of episodes drawn from the same grammar model as the previous ones.</p>
</caption>
<graphic xlink:href="fncom-05-00036-g005"></graphic>
</fig>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption>
<p>
<bold>Serial recall of lists</bold>
.
<bold>(A)</bold>
Episodes are composed of 5 items presented in temporal sequence.
<bold>(B)</bold>
Temporal association as a function of the temporal interval between two items.
<bold>(C)</bold>
Color-coded representation of the temporal confusion matrix after 1, 6, and 20 iterations. The confusion matrix gives the probability of retrieving item X at position Y; a diagonal matrix therefore denotes perfect sequential recall.
<bold>(D)</bold>
Optimal parsing tree for the sequence after 20 iterations.</p>
</caption>
<graphic xlink:href="fncom-05-00036-g006"></graphic>
</fig>
</sec>
<sec id="s8">
<label>2.12</label>
<title>Familiarity and episode likelihood</title>
<p>Another important measure of recall is the familiarity of a newly experienced episode. In a generative model framework, this translates into the likelihood that the generative model produces that episode. By definition this corresponds to the inside probability:</p>
<disp-formula id="E49">
<label>(40)</label>
<mml:math id="M64">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">O</mml:mtext>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>so that familiarity can be computed efficiently. The same formula can also be used to simulate the behavior of the model in production, as for example in the simulations of Figures
<xref ref-type="fig" rid="F6">6</xref>
and
<xref ref-type="fig" rid="F7">7</xref>
.</p>
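<p>For illustration, a recursion over item subsets can compute this inside probability directly. A sketch with hypothetical names (not the authors' code): <italic>a</italic>[(<italic>i</italic>,<italic>j</italic>,<italic>k</italic>)] for the learned transitions, <italic>term_prob</italic>[(<italic>i</italic>,<italic>x</italic>)] for terminal emissions, and <italic>assoc</italic>(<bold>Q</bold><sub>1</sub>,<bold>Q</bold><sub>2</sub>) for the episodic association factor. The subset enumeration again restricts this to small episodes.</p>
<preformat>
from functools import lru_cache
from itertools import combinations

def proper_subsets(Q):  # repeated here for self-containment
    items = sorted(Q)
    for r in range(1, len(items)):
        for c in combinations(items, r):
            yield frozenset(c)

def familiarity(items, a, term_prob, assoc, labels, start):
    """Inside probability e(start, O): likelihood that the grammar
    generates episode O, summed over all parse trees."""
    @lru_cache(maxsize=None)
    def inside(i, Q):
        if len(Q) == 1:
            (x,) = Q
            return term_prob.get((i, x), 0.0)
        total = 0.0
        for Q1 in proper_subsets(Q):   # each ordered split (Q1, Q2)
            Q2 = Q - Q1
            for j in labels:
                e_j = inside(j, Q1)
                if e_j == 0.0:
                    continue
                for k in labels:
                    total += (a.get((i, j, k), 0.0) * assoc(Q1, Q2)
                              * e_j * inside(k, Q2))
        return total

    return inside(start, frozenset(items))
</preformat>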
<fig id="F7" position="float">
<label>Figure 7</label>
<caption>
<p>
<bold>Free recall of lists</bold>
.
<bold>(A)</bold>
An episode composed of items 1–5 is learned in a certain context; other, distractor episodes are learned without that context.
<bold>(B)</bold>
Probability of recall of items in the context and no-context conditions after 10, 40, and 100 iterations.
<bold>(C)</bold>
Confusion matrices for the (1,2,3,4,5) episode in the original context and with no context.</p>
</caption>
<graphic xlink:href="fncom-05-00036-g007"></graphic>
</fig>
</sec>
<sec>
<label>2.13</label>
<title>Simulation parameters</title>
<p>For all simulations, the parameter
<italic>β</italic>
was set to 3, and the learning rate
<italic>η</italic>
was 0.2. The number of non-terminals was 20 for the simulations of Figures
<xref ref-type="fig" rid="F4">4</xref>
,
<xref ref-type="fig" rid="F6">6</xref>
, and
<xref ref-type="fig" rid="F7">7</xref>
, 30 for the simulations of Figure
<xref ref-type="fig" rid="F8">8</xref>
, and 40 for the simulations of Figure
<xref ref-type="fig" rid="F5">5</xref>
. Each iteration consisted of two runs of the EM algorithm with the same randomly selected episode.</p>
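<p>For concreteness, these settings can be gathered in a single configuration object (an illustrative sketch; the key names are ours, not the authors'):</p>
<preformat>
# Illustrative collection of the stated settings (names hypothetical):
SIMULATION_PARAMS = {
    "beta": 3.0,                 # association weighting parameter
    "eta": 0.2,                  # learning rate
    "n_nonterminals": {"fig4": 20, "fig6": 20, "fig7": 20,
                       "fig8": 30, "fig5": 40},
    "em_runs_per_iteration": 2,  # same randomly chosen episode twice
}
</preformat>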
<fig id="F8" position="float">
<label>Figure 8</label>
<caption>
<p>
<bold>Learning of artificial grammars</bold>
.
<bold>(A)</bold>
Example syllable configurations from Grammar 1 (of the form A-X-B, C-X-D) and Grammar 2 (A-X-D, C-X-B).
<bold>(B)</bold>
Average log-likelihood assigned by the model to examples from Grammar 1 (solid line) and Grammar 2 (dashed line) as a function of the number of training iterations. In the first 500 iterations, episodes from Grammar 1 (left) or the unpredictive grammar (right) are presented.
<bold>(C)</bold>
Optimal parse trees for Grammar 1 after 500 iterations (top) and for Grammar 2 after 1000 iterations (bottom).</p>
</caption>
<graphic xlink:href="fncom-05-00036-g008"></graphic>
</fig>
</sec>
</sec>
<sec>
<label>3</label>
<title>Results</title>
<sec>
<label>3.1</label>
<title>Consolidation of bottom-up structures and decontextualization</title>
<p>The model successfully learned complex structures present in episodic memory, and by virtue of this, it can account for several phenomena observed in memory consolidation. First, consolidated memories tend to lose their dependence on context with time, as shown for fear conditioning and socially acquired food preference (Winocur et al.,
<xref ref-type="bibr" rid="B87">2007</xref>
) and as predicted by the “transformational” theories of memory consolidation (Moscovitch et al.,
<xref ref-type="bibr" rid="B50">2005</xref>
,
<xref ref-type="bibr" rid="B49">2006</xref>
).</p>
<p>Because of the properties of tree graphs, which embody the “context-free” character of SCFGs, decontextualization is obtained naturally in this model. This can be seen in simulations as follows: the model is trained on a set of 8 “episodes” (Figure
<xref ref-type="fig" rid="F4">4</xref>
A), each represented as a configuration of 5 items. As above, we display associations between items as proximity in the 2-D plane. In 5 out of 8 episodes, the group of items (1, 2, 3) appears in a strongly associated form, but in 5 different contexts (different combinations of the other items completing the episode). The (1,2,3) group may represent stimuli that are closely related to each other (for example, a tone and a shock in a fear conditioning paradigm).</p>
<p>The consolidation algorithm is trained for 400 iterations; in each iteration one of the training episodes, randomly chosen, is used to simulate reactivation. With time, a context-independent representation develops. To assess this, we test the resulting memory structure on a new episode (Figure
<xref ref-type="fig" rid="F4">4</xref>
B) which was not in the training set. In this test episode, items 1, 2, 3 appear, but the structure of the inter-item associations does not make it evident that they belong to the same group: item 6 in fact has stronger associations with items 1, 2, and 3 than the latter have with each other. This is similar to a situation in which a distractor stimulus is interleaved between the Conditioned Stimulus (e.g., a tone) and the Unconditioned Stimulus (e.g., a shock). At the beginning of training (5 iterations), the optimal parse tree for the test episode (see Section
<xref ref-type="sec" rid="s7">2.11</xref>
for the definition) reflects episodic associations only (Figure
<xref ref-type="fig" rid="F4">4</xref>
C): in this tree, item 1 is first grouped with item 6 (the distractor), item 3 with item 5, and item 2 is isolated. Thus, the optimal parsing is completely dominated by the associations as they are perceived in that particular episode. After 50 iterations, items 1 and 2 are grouped together; after 100 iterations, items 1, 2, and 3 are grouped together in one subtree, and this grouping is maintained after 150 iterations and afterward. The key to this behavior is the evolution of the transition probabilities for “hidden” or non-terminal nodes: with consolidation, non-terminal node 20 becomes increasingly likely to generate nodes 1 and 2, while node 13 (the root node of the subtree encoding the (1,2,3) group) acquires a large probability of generating nodes 20 and 3. With time, the semantic transition matrix
<italic>a</italic>
is shaped so that this emerging parsing (the optimal one after 150 iterations and later) remains the only one with a consistently high likelihood, the other parsing trees becoming less and less likely (Figure
<xref ref-type="fig" rid="F4">4</xref>
D). Thus, nodes 13 and 20 create a higher-order representation of the “concept” of group 1, 2, 3, as it could be stored for the long term in associational or prefrontal cortical areas (Takashima et al.,
<xref ref-type="bibr" rid="B75">2006</xref>
). Such a representation can be activated regardless of the precise context (or the remainder of the episode), and purely by bottom-up signals.</p>
</sec>
<sec>
<label>3.2</label>
<title>Inductive reasoning and top-down learning of categories</title>
<p>While in the previous example the “inside” probabilities, that is, bottom-up processing, are critical for the outcome, much of the model's power in complex situations arises from top-down processing. One situation in which this is visible is when different items often happen to occur in similar contexts. Then a common representation should emerge that generalizes across all these items. For example, we recognize an object as a “hat,” however funnily shaped it is, because we observe it on somebody's head, in a process similar to “Bayesian model merging” (see e.g., Stolcke and Omohundro,
<xref ref-type="bibr" rid="B73">1994</xref>
). A related cognitive task is tapped by the Advanced Progressive Matrices test of inductive reasoning (Raven et al.,
<xref ref-type="bibr" rid="B59">1998</xref>
), where a pattern has to be extracted from a number of sequential examples and then used to complete a new example. Performance in this task has been seen to correlate with spindle activity during slow-wave sleep (Schabus et al.,
<xref ref-type="bibr" rid="B64">2006</xref>
).</p>
<p>In this simulation, item groups (4,5) and (6,7) appear repeatedly and interchangeably in different contexts in successive training episodes (Figure
<xref ref-type="fig" rid="F5">5</xref>
A). A common representation for the two groups develops, in the form of the non-terminal node 11. With training, this node sees its semantic transition probabilities to both of these groups increase (Figure
<xref ref-type="fig" rid="F5">5</xref>
B). As a result, the model is capable of performing inferences on configurations it has never seen before. The probe episode (Figure
<xref ref-type="fig" rid="F5">5</xref>
A) has the same item composition as the last training episode (the “prototype”), except for group (6,7), which substitutes for (4,5). Moreover, the perceptual associations do not suggest pairing 6 and 7 (rather 1 with 6 and 2 with 7). Yet, after 3000 iterations, the optimal parsing tree couples 6 and 7 through the activation of the non-terminal 11, which is driven by top-down influences (Figure
<xref ref-type="fig" rid="F5">5</xref>
C), in a parsing tree identical to the one computed for the prototype episode. The outside probabilities are critical for the build-up of the generalized representation: indeed, during training, the probability that 11 → (4,5) is the first to rise (Figure
<xref ref-type="fig" rid="F5">5</xref>
B). This leads to an increase of the top-down, outside probability
<italic>f</italic>
(11,(6,7)) for the episodes in which (6,7) appears in lieu of (4,5), forcing activation of node 11. Finally, driven by these top-down effects, non-terminal node 11 acquires a high probability of generating (6,7) as well.</p>
<p>In a further simulation, we tested the generalization behavior of the model. We used the model resulting from training on the episodes of Figure
<xref ref-type="fig" rid="F5">5</xref>
A to generate 48 episodes, and used the first 40 to train a new model, keeping the remaining 8 as a test set. Throughout 2000 training iterations, the log-likelihoods for the training and test sets rise with similar time courses, and reach levels much higher than for a shuffled control set of episodes (Figure
<xref ref-type="fig" rid="F5">5</xref>
D).</p>
</sec>
<sec>
<label>3.3</label>
<title>List learning, serial, and free recall</title>
<p>While not strictly a semantic memory operation, list learning is a very popular paradigm to investigate declarative memory, and has been an important benchmark for theories in the field (McClelland et al.,
<xref ref-type="bibr" rid="B45">1995</xref>
). We consider it here to assess our framework's ability to store strictly sequential information, which is an important case for memory storage, an ability that does not follow immediately from the model's definition. We assessed the behavior of our model in this task by considering two common experimental procedures: serial and contextual free recall. In serial recall, ordered lists are presented to the subjects, who have to recall them in the same order after a retention interval. We simulated the presentation of an ordered list by constructing an “episode” composed of 5 items (numbered from 1 to 5; Figure
<xref ref-type="fig" rid="F6">6</xref>
A), presented sequentially in time at regular intervals. These associations essentially reflect the temporal interval between the presentations of the respective items and they decay exponentially with that interval. To convey the notion of temporal ordering, we made the perceptual association matrix
<italic>s</italic>
asymmetrical, so that “backward” associations (associations from more recent to more remote items) decay more than three times faster than forward associations (Figure
<xref ref-type="fig" rid="F6">6</xref>
B; see Section
<xref ref-type="sec" rid="s1">2</xref>
). This form of association was inspired by similar patterns of confusion observed in human subjects during list retrieval (Kahana,
<xref ref-type="bibr" rid="B31">1996</xref>
) and could find a plausible implementation in neural circuits with synapses obeying a spike-timing-dependent plasticity rule (Markram et al.,
<xref ref-type="bibr" rid="B42">1997</xref>
; Bi and Poo,
<xref ref-type="bibr" rid="B5">1998</xref>
). We trained our model with several iterations of the consolidation algorithm described above, then, to simulate free retrieval, we used the resulting transition matrix
<italic>a</italic>
to compute the probabilities of “spontaneous” generation by the branching process of all permutations of the 5 items, when the episodic association matrix was set to uniform values (to mimic the absence of perceptual inputs). These probabilities are expressed in terms of a sum over all possible parsing trees having the 5 items in the target list as terminals, but can be efficiently calculated with the inside–outside algorithm (see Section
<xref ref-type="sec" rid="s1">2</xref>
). Then, we computed the confusion matrix
<italic>C</italic>
(
<italic>i</italic>
,
<italic>j</italic>
), representing the probability that item
<italic>i</italic>
is retrieved at the
<italic>j</italic>
-th position. With training, the model gradually evolved from chance-level performance to perfect serial recall (after 100 iterations; Figure
<xref ref-type="fig" rid="F6">6</xref>
C), indicated by a diagonal confusion matrix. This becomes possible because the optimal parsing tree (Figure
<xref ref-type="fig" rid="F6">6</xref>
D) as well as all of the most probable parsing trees generate the 5 items in the correct order.</p>
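<p>The construction of the asymmetric perceptual association matrix can be sketched in a few lines of Python; the decay constants below are illustrative choices satisfying the stated constraint (backward associations decaying more than three times faster than forward ones), not the exact values used in the simulations.</p>
<preformat>
import numpy as np

def serial_list_associations(n_items=5, tau_forward=1.0, tau_backward=0.3):
    """Perceptual association matrix s for an ordered list.

    Items are presented at unit intervals; associations decay
    exponentially with the inter-presentation interval, and
    "backward" associations (from more recent to more remote items)
    decay faster than forward ones."""
    s = np.zeros((n_items, n_items))
    for i in range(n_items):
        for j in range(n_items):
            if i == j:
                continue
            dt = abs(i - j)
            tau = tau_forward if j > i else tau_backward  # j after i: forward
            s[i, j] = np.exp(-dt / tau)
    return s

print(np.round(serial_list_associations(), 3))
</preformat>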
<p>In contextual free recall, subjects are shown a list of objects. After retention, they are placed in the same context and asked to enumerate as many items from the original list as possible, regardless of order. We simulated the training conditions by composing a target episode from a 5-item ordered list, with associations as described in Figures
<xref ref-type="fig" rid="F6">6</xref>
A,B. We added to the episode one “context” item, which had identical, symmetrical associations with the items in the sequence. In addition, the network was trained on 5 more 6-item episodes, composed of items that were not included in the target list, in different permutations (Figure
<xref ref-type="fig" rid="F7">7</xref>
A). After several iterations, we simulated the same-context retrieval condition by computing the probability of generating all 6-item episodes that include the context item. As above, episodic associations were set to a uniform value, so that they were irrelevant. While at the beginning of training all items were equally likely to be generated, after both 40 and 100 iterations the network generated only the 5 items in the target list (Figure
<xref ref-type="fig" rid="F7">7</xref>
B). Next, the no-context situation was simulated by restricting attention to the generation of 6-item episodes that do not include the context item, that is, we considered the generation probability conditional on the context item not being generated. Here, following training, items that were not part of the target list were more likely to be retrieved than target items (Figure
<xref ref-type="fig" rid="F7">7</xref>
B). In the situation in which exactly the 5 target items were retrieved, we assessed whether retrieval mirrored the original order of item presentation. In the context condition, this was successfully accomplished after 100 iterations (Figure
<xref ref-type="fig" rid="F7">7</xref>
C). However, an erroneous order (3-4-5-1-2 in this simulation) was the most likely in the no-context condition, showing that the model makes use of contextual information to correctly encode and retrieve episodes, even in this simplified setting.</p>
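<p>A minimal sketch of the two retrieval conditions follows, assuming a hypothetical <italic>gen_prob(items)</italic> function standing in for the spontaneous generation probability computed under the trained grammar (with uniform episodic associations): the context condition enumerates 6-item episodes containing the context item, the no-context condition those that do not.</p>
<preformat>
import itertools

def retrieval_distributions(gen_prob, pool, context, k=6):
    """Normalized generation probabilities of k-item episodes in the
    context condition (episodes containing the context item) and the
    no-context condition (episodes without it)."""
    with_ctx = {c: gen_prob(c + (context,))
                for c in itertools.combinations(pool, k - 1)}
    no_ctx = {c: gen_prob(c)
              for c in itertools.combinations(pool, k)}
    # normalize each condition separately: the no-context case is a
    # probability conditional on the context item not being generated
    z_w, z_n = sum(with_ctx.values()), sum(no_ctx.values())
    return ({c: p / z_w for c, p in with_ctx.items()},
            {c: p / z_n for c, p in no_ctx.items()})
</preformat>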
</sec>
<sec>
<label>3.4</label>
<title>Learning and generalizing across artificial grammars</title>
<p>Memory consolidation may modify the milieu in which new memories are stored, facilitating the encoding of memories for which a schema, or mental framework, is already in place (Tse et al.,
<xref ref-type="bibr" rid="B80">2007</xref>
). We asked whether our model can account for such effects in simulations inspired by an experiment performed on 18-month-old infants (Gómez et al.,
<xref ref-type="bibr" rid="B23">2006</xref>
). In the experiment, infants are familiarized with an artificial grammar (Grammar 1 in Figure
<xref ref-type="fig" rid="F8">8</xref>
A) in which artificial three-syllable words are composed such that syllable A in first position is predictive of syllable B in third position, and C in first position predicts D in third position. The second syllable in the word is independent of the others. In simulations, episodes with 6 items were presented: three fixed items encoded the syllable positions in the word, and association strength followed the same asymmetrical form as in Figure
<xref ref-type="fig" rid="F6">6</xref>
B. The remaining three items encoded the actual syllable identity (see Section
<xref ref-type="sec" rid="s1">2</xref>
). The model could learn to correctly parse words generated according to this grammar and to assign to them a high likelihood (Figure
<xref ref-type="fig" rid="F8">8</xref>
B, left). The optimal parsing trees depended on a few key transition rules: the “start” node
<italic>S</italic>
generated nodes 26 and 40 with high probability, which correctly encoded the rule in the grammar. After training, a limited number of non-terminal nodes encoded, e.g., the association between position 1 and syllable A, as well as between position 2 and the multiple syllables that occurred in that position.</p>
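<p>Encoding a three-syllable word as a 6-item episode can be sketched as follows; the binding strength between each syllable item and its position item is an illustrative choice, as are the decay constants of the temporal associations.</p>
<preformat>
import numpy as np

def word_episode(syllables, tau_forward=1.0, tau_backward=0.3, bind=1.0):
    """Encode a three-syllable word, e.g., ('A', 'x', 'B'), as a
    6-item episode: three fixed position items carrying the
    asymmetric temporal associations of Figure 6B, plus three
    syllable-identity items, each bound to its position item."""
    items = ['pos1', 'pos2', 'pos3'] + list(syllables)
    s = np.zeros((6, 6))
    for i in range(3):
        for j in range(3):
            if i != j:
                tau = tau_forward if j > i else tau_backward
                s[i, j] = np.exp(-abs(i - j) / tau)
    for k in range(3):  # bind syllable k to position k, symmetrically
        s[k, 3 + k] = s[3 + k, k] = bind
    return items, s

items, s = word_episode(('A', 'x', 'B'))
</preformat>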
<p>In the original experiment, after a retention period, infants were tested on words from a different grammar, in which the predictive associations were swapped (A predicts D, C predicts B; Grammar 2 in Figure
<xref ref-type="fig" rid="F8">8</xref>
A). Infants who were allowed to sleep during retention were more likely than non-sleeping infants to quickly become familiar with the new grammar (Gómez et al.,
<xref ref-type="bibr" rid="B23">2006</xref>
), suggesting a role of sleep-related processes in facilitating the genesis of a higher-order schema (position 1 predicts position 3, regardless of the exact tokens). The same effect was observed in our simulations when we switched training from Grammar 1 to Grammar 2: the average likelihood assigned by the model to Grammar 2 words climbed much faster than the likelihood for Grammar 1 had at the beginning of the simulation, which started from a “blank slate” condition. Grammar 2 very quickly became preferred over Grammar 1 (Figure
<xref ref-type="fig" rid="F8">8</xref>
B). In fact, assigned likelihood is the closest analog in our model to a measure of familiarity (see Section
<xref ref-type="sec" rid="s1">2</xref>
). This behavior can be understood because, when learning Grammar 2, the model was able to “re-use” several non-terminal nodes shaped by training on the previous grammar. This was observed especially for lower-level rules (e.g., node 36 encoding the fact that syllable A is likely to occur in first position; Figure
<xref ref-type="fig" rid="F8">8</xref>
C). The switch between the two grammars was obtained by modifying a few critical non-terminals: for example, the Start node acquired a high probability of generating the (26,32) pair, which in turn correctly generates words according to Grammar 2.</p>
<p>As a control, we performed simulations in which training started with a grammar in which A and C always occurred in position 1 and B and D always in position 3, but with no predictive association among them. As above, the switch to Grammar 2 was made after 500 iterations. This time, the model failed to learn to discriminate between Grammars 1 and 2: the likelihoods for the two grammars rose nearly identically, already during the initial training. This is because words generated according to either of the two grammars could also be generated by the non-predictive grammar, so that the switch to Grammar 2 is not sensed as an increase in novelty by the model, in an effect reminiscent of learned irrelevance.</p>
</sec>
</sec>
<sec sec-type="discussion">
<label>4</label>
<title>Discussion</title>
<p>Recently, structured probabilistic models have been drawing the attention of cognitive scientists (Chater et al.,
<xref ref-type="bibr" rid="B10">2006</xref>
), as a theoretical tool to explain several cognitive abilities, ranging from vision to language and motor control. Moreover, because of their precise mathematical formulation, such models may help envision how these high-level capabilities could be implemented in the brain. In this work we propose a possible avenue for implementing SCFGs in the nervous system, and we show that the resulting theory can reproduce, at least in simple cases, many properties of semantic memory within a unified framework. We sketch a possible outline of a neural system that is equivalent to the stochastic grammar, in the form of interacting modules, each representing a possible transition rule in the grammar; these computations may take place in cortical modules. We also show how new representations may be formed: learning the grammar-based model from examples is a complex optimization process, requiring global evaluation of probabilities over entire episodes. Here, we demonstrate that learning can be accomplished by a local algorithm requiring, at each module, only knowledge of quantities available at the module's inputs. Grammar parameters are computed by Monte-Carlo estimation, which requires random sampling of the correlations present in the episodic data. We propose that this sampling is accomplished during sleep, through the replay of experience-related neural patterns.</p>
<sec>
<label>4.1</label>
<title>Theoretical advances</title>
<p>From a theoretical point of view, our approach starts from a standard SCFG, and the standard learning algorithm for these models, the inside–outside algorithm (Lari and Young,
<xref ref-type="bibr" rid="B37">1990</xref>
), but introduces two novel elements: first, our SCFG is a generative model for “episodes,” rather than strings (as its linguistic counterpart), with episodes described as sets of items with a given relational structure, expressed in the association matrix. Associations can embed spatio-temporal proximity, as well as similarity in other perceptual or cognitive dimensions. Associations contribute to shaping the transition rules
<italic>a</italic>
(
<italic>i</italic>
,
<italic>j</italic>
,
<italic>k</italic>
), but sometimes they can compete with them. This is the case for the examples in Figures
<xref ref-type="fig" rid="F4">4</xref>
and
<xref ref-type="fig" rid="F5">5</xref>
, where the best parsing clusters together items that are not the closest ones according to the association matrix
<italic>s</italic>
. Also, it is easy to see that the learning process can capture a link between “distant” items, if this link is consistent across many different episodes. In this sense, our grammar model, unlike string-based SCFGs, is sensitive to long-range correlations, alleviating one major drawback of SCFGs in computational linguistics. Making these long-range, weak correlations explicit is functionally equivalent to some types of insight phenomena that have been described in the memory consolidation literature (Wagner et al.,
<xref ref-type="bibr" rid="B84">2004</xref>
; Ellenbogen et al.,
<xref ref-type="bibr" rid="B16">2007</xref>
; Yordanova et al.,
<xref ref-type="bibr" rid="B88">2009</xref>
).</p>
<p>The generation of an association matrix also makes this model a valid starting point for grammar-based models of vision, especially as far as object recognition is concerned (Zhu et al.,
<xref ref-type="bibr" rid="B91">2007</xref>
; Savova and Tenenbaum,
<xref ref-type="bibr" rid="B63">2008</xref>
; Savova et al.,
<xref ref-type="bibr" rid="B62">2009</xref>
): the association matrix provides information about the relative positions of the constituents of an object. To enrich the representation of visual scenes, one possibility would be to include anchor points (e.g., at the top, bottom, left, and right of an object) among the items in the representation, which would endow the association matrix with information about the absolute spatial position of constituents. Theoretical work on Bayesian inference for object recognition and image segmentation has concentrated on “Part-based models” (see, e.g., Orbán et al.,
<xref ref-type="bibr" rid="B53">2008</xref>
). These are essentially two-level hierarchical models that are subsumed for the most part by our approach. The added value of grammar-based models like ours, however, would be to provide better descriptions of situations in which the part structure of an object is not fixed (Savova and Tenenbaum,
<xref ref-type="bibr" rid="B63">2008</xref>
; unlike, for example, a face), and a multi-level hierarchy is apparent (Ullman,
<xref ref-type="bibr" rid="B82">2007</xref>
).</p>
<p>A second novel contribution concerns the learning dynamics: we demonstrate that the association matrix-enriched SCFG can be trained by a modified inside–outside algorithm (Lari and Young,
<xref ref-type="bibr" rid="B37">1990</xref>
), where the cumbersome E-step can be performed by Monte-Carlo estimation (Wei and Tanner,
<xref ref-type="bibr" rid="B85">1990</xref>
). In this form, the learning algorithm has a local expression. Inside–outside algorithms involve summation over a huge number of trees spanning the entire episode. Here, however, we show that all the global contributions can be subsumed, for each tree sampled in the Monte-Carlo process, by a total tree probability term (the “semantic strength” of Eq.
<xref ref-type="disp-formula" rid="E9">7</xref>
), which can be computed at the root node and propagated down the network. As we will discuss below, this reformulation of the model is crucial for mapping the algorithm on the anatomy and physiology of the brain.</p>
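<p>The locality of the resulting update can be illustrated with a short sketch. Each sampled tree is taken here as a list of rule applications (<italic>i</italic>, <italic>j</italic>, <italic>k</italic>), and <italic>tree_prob(tree)</italic> is a hypothetical stand-in for the total tree probability of Eq. 7: a single scalar computed at the root and shared by every rule in the tree, so that each module needs only its own inputs plus this one propagated term.</p>
<preformat>
from collections import defaultdict

def monte_carlo_counts(sampled_trees, tree_prob):
    """Accumulate rule statistics for the transition probabilities
    a(i, j, k) over a batch of trees sampled in the Monte-Carlo
    E-step. The global contribution of each tree is subsumed by one
    root-level weight; all remaining increments are local."""
    counts = defaultdict(float)
    for tree in sampled_trees:
        w = tree_prob(tree)        # "semantic strength", from the root
        for (i, j, k) in tree:     # local increment at module k
            counts[(i, j, k)] += w
    return counts
</preformat>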
<p>Further, there are some similarities between our work and neural network implementations of stochastic grammars for linguistic material (Borensztajn et al.,
<xref ref-type="bibr" rid="B9">2009</xref>
); however, those models deal specifically with language, and learning is based on only one parsing (the maximum likelihood one). In our approach, all possible parsings of an episode are in principle considered, and learning may capture relevant aspects of an episode that are represented only in suboptimal parsings.</p>
<p>Another important feature of the model is that learning automatically takes into account the “novelty” of an episode, by modulating the learning rate (Eq.
<xref ref-type="disp-formula" rid="E2">2</xref>
) by the likelihood assigned to the entire episode
<italic>E</italic>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
), given the current state of the transition matrix. This guarantees sensitivity to sudden changes in the environment (Yu and Dayan,
<xref ref-type="bibr" rid="B89">2005</xref>
), and can explain phenomena like learned irrelevance in complex situations, such as that described in Figure
<xref ref-type="fig" rid="F8">8</xref>
B.</p>
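<p>In the standard inside–outside E-step, expected rule counts are normalized by the total episode probability, so that an improbable (novel) episode weighs its constituent rules relatively more heavily; the following fragment is a minimal illustration of such a likelihood-dependent modulation, not the exact form of Eq. 2.</p>
<preformat>
def modulated_rate(base_rate, episode_likelihood, eps=1e-12):
    """Learning rate scaled by the (inverse) likelihood E(O) assigned
    to the whole episode under the current transition matrix, so that
    novel episodes drive larger updates. Illustrative form only."""
    return base_rate / (episode_likelihood + eps)
</preformat>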
<p>In the current framework, we model memory retrieval in two different ways: first, the episode likelihood
<italic>E</italic>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
) provides an indication of the familiarity of a newly presented episode. The episode likelihood may also be used to evaluate different completions of episodes in which some of the items are not explicitly presented to the model: all possible alternatives can be compared, and the one with the highest likelihood can be considered the model's guess. The absolute value of the likelihood quantifies the episode's familiarity, given the current state of the model. Second, computing the best parse tree allows one to determine the likely latent “causes” of the episode, which correspond to the non-terminal nodes participating in the parse (Figures
<xref ref-type="fig" rid="F4">4</xref>
C,
<xref ref-type="fig" rid="F5">5</xref>
C,
<xref ref-type="fig" rid="F6">6</xref>
D, and
<xref ref-type="fig" rid="F8">8</xref>
C). The number of parsing trees contributing significantly to the probability mass (or, better, the entropy of the probability distribution induced by the grammar model over the trees spanning the episode) provides a measure of the uncertainty in episode interpretation. Because the model performs full Bayesian inference, the probability of latent causes is computed optimally at all levels in the trees, as is the case in simpler Bayesian models (Ernst and Banks,
<xref ref-type="bibr" rid="B17">2002</xref>
; Körding and Wolpert,
<xref ref-type="bibr" rid="B35">2004</xref>
). In particular, the full probability distribution over parsing trees (representing possible causes of an episode) is optimized offline, and is available at the time an episode is presented, in the form of neural activity levels.</p>
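<p>The first retrieval mode, comparing completions by likelihood, is easy to sketch; <italic>score(items)</italic> is again a hypothetical stand-in for the episode likelihood under the current grammar.</p>
<preformat>
def best_completion(score, partial_items, candidates):
    """Pattern completion by likelihood comparison: each candidate is
    tried as the missing item, and the completion with the highest
    episode likelihood is returned as the model's guess. The score
    itself quantifies the familiarity of the completed episode."""
    scored = {c: score(list(partial_items) + [c]) for c in candidates}
    guess = max(scored, key=scored.get)
    return guess, scored
</preformat>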
<p>Clearly, trees are only a partially adequate format for semantic representations, and in many cases other structures are more appropriate (Kemp and Tenenbaum,
<xref ref-type="bibr" rid="B33">2008</xref>
). Here, we limited ourselves to trees because of the availability of relatively simple and computationally affordable algorithms from computational linguistics. However, generalization to other, more powerful graph structures, in particular those containing loops, while technically challenging, is theoretically straightforward, and may be accomplished with the same basic assumptions about neocortical modules and Monte-Carlo sampling-based training.</p>
</sec>
<sec>
<label>4.2</label>
<title>Relevance for semantic memory and consolidation</title>
<p>What we describe here amounts to an interaction between two memory systems with profoundly different organizational principles: on the one hand, an event-based relational code; on the other, a structured system of relationships between items, which can be recollected in a flexible way and contributes to the interpretation and prediction of future occurrences. In psychological terms, these correspond to episodic and semantic memory, respectively. The distinction between these two subdivisions of declarative memory is paralleled by the division of labor between the archicortex (that is, the hippocampus and associated transitional cortices) and the neocortex. Memory acquisition depends critically on the hippocampus (Scoville and Milner,
<xref ref-type="bibr" rid="B65">1957</xref>
; Marr,
<xref ref-type="bibr" rid="B43">1971</xref>
; Bayley et al.,
<xref ref-type="bibr" rid="B4">2005</xref>
; Moscovitch et al.,
<xref ref-type="bibr" rid="B49">2006</xref>
); this dependency fades with time, while the involvement of the neocortex increases. The basic tenet of systems consolidation theory is that this shift in memory location reflects a real re-organization of the synaptic underpinnings of memory, with transfer of information between the areas, in addition to changes at the molecular and synaptic levels. The result of this process is a stabilized memory, resistant to damage to the medial temporal lobe. However, not all memories are consolidated alike: there is a lively debate on whether episodic memory ever becomes completely hippocampally independent, while semantic memory consolidates faster (Bayley et al.,
<xref ref-type="bibr" rid="B4">2005</xref>
; Moscovitch et al.,
<xref ref-type="bibr" rid="B49">2006</xref>
).</p>
<p>Our model addresses some aspects of the information transfer related to systems consolidation: in this sense, the model represents an evolution of theories of “dual memory systems” (McClelland et al.,
<xref ref-type="bibr" rid="B45">1995</xref>
). As in these theories, there are two distinct modules for episodic and semantic memory, respectively. Thus, our model would predict, similarly to McClelland et al. (
<xref ref-type="bibr" rid="B45">1995</xref>
), that without the “hippocampus” module the semantic model could not be updated, but would not be otherwise impaired (similar to hippocampally lesioned amnesic patients), and that without the semantic module the ability to retain episodic information would be preserved (similar to semantic dementia patients). However, there are important differences from these previous models: first, in the current theory, the highly structured nature of semantic knowledge is an emergent property of a single mathematical framework (a modified SCFG); because of this, as we showed with multiple simulations, the same model can account for a wide variety of phenomena, suggesting that, with consolidation, a real representational change takes place. As shown in Figure
<xref ref-type="fig" rid="F4">4</xref>
, our model is capable of effectively producing context-independent representations of stimulus configurations, which may encode a complex object (which could then, for example, be disambiguated from a background) or an association between a conditioned and an unconditioned stimulus, as in, e.g., fear conditioning. These associations can then be recognized even when they are presented in a corrupted version (e.g., in a different context), with the evidence for them represented by weak or long-range correlations. This context independence is intrinsic to the tree-like nature of the representations we assume for semantic memory, and is an observed feature of consolidated memories (Winocur et al.,
<xref ref-type="bibr" rid="B87">2007</xref>
). As this new representation gains strength relative to the context-dependent representation supposed to be stored in the hippocampal formation, it may become more and more important in driving behavior, explaining some of the known behavioral effects.</p>
<p>We elaborated two further examples of the ability of the model to generalize from data across multiple episodes. Figure
<xref ref-type="fig" rid="F5">5</xref>
shows that categories may be formed from items that occur in the same context. This requires detection of correlations that span long intervals of time (multiple episodes) and multiple steps [items (4,5) → context → items (6,7)]. The prediction that such a mechanism of category formation depends on systems consolidation can be tested in future experiments. Furthermore, this capacity is recognized as an important element of language acquisition (Tomasello,
<xref ref-type="bibr" rid="B77">2005</xref>
). Recognition of weak and long-range correlations also lies at the basis of a number of insight-like phenomena that are favored by consolidation, and by sleep in particular (Wagner et al.,
<xref ref-type="bibr" rid="B84">2004</xref>
; Ellenbogen et al.,
<xref ref-type="bibr" rid="B16">2007</xref>
; Yordanova et al.,
<xref ref-type="bibr" rid="B88">2009</xref>
).</p>
<p>Figure
<xref ref-type="fig" rid="F8">8</xref>
shows, in a simple case, that the grammar model extracted from previous examples can be used as a schema to facilitate the acquisition of further information sharing the same structure. From acquisition of the first artificial grammar, the model learns that the first syllable predicts the third syllable, and learns which syllables are likely to occur in each position. Once this information has been learned (which requires a consolidation interval, corresponding to multiple iterations of the inside–outside algorithm), learning of the second grammar, with the same structure as the first but swapped tokens, can take place much faster. This is demonstrated by the increased likelihood assigned by the model to configurations produced according to the “swapped” grammar a few iterations after it has been presented for the first time. In behavioral terms, the increased likelihood would correspond to increased familiarity and probability of recognition. This parallels the experimental finding by Gómez et al. (
<xref ref-type="bibr" rid="B23">2006</xref>
) that after a sleep period children are more likely to acquire and recognize the second grammar. In the experiment, children who did not sleep could not acquire the second grammar, but were still able to recognize the original grammar. In our simulations, this latter situation corresponds to presenting the second grammar after only a few iterations of the learning algorithm. Under these conditions, in fact, learning of the second grammar would be as slow as learning of the original grammar, while the original grammar already has an enhanced likelihood, supporting later familiarity and recognition. In our model, we could reproduce this behavior with few extra assumptions, except for the notion of place in the sequence, which is key to the target grammar and which we simulated here with the “slot number” items. An alternative approach would have been to use an off-diagonal association matrix (associations non-zero only for consecutive items in the sequences), which would make our model equivalent to a standard SCFG as used in linguistics. Connectionist approaches may also solve the same task, for example some modification of the model by Dienes et al. (
<xref ref-type="bibr" rid="B15">1999</xref>
). In this model, a “mapping layer” is introduced in order to separate the abstract grammar from the identity of the tokens (which is the problem here). However, we find this a much heavier
<italic>ad hoc</italic>
assumption than those required for modeling this result in our framework. In the experimental literature, some initial results have begun to link slow-wave sleep neural activity with inductive reasoning as well, of which we think our result of Figure
<xref ref-type="fig" rid="F5">5</xref>
represents an admittedly simplistic example (Schabus et al.,
<xref ref-type="bibr" rid="B64">2006</xref>
). In this work, the authors show that sleep spindle magnitude is correlated, across subjects, with performance in the Advanced Progressive Matrices test of inductive reasoning. Subjects showing the strongest sleep spindles are the ones that perform best in this task, indicating that neural activity during sleep could be important in shaping the brain networks supporting these abilities. Our simulations provide a possible model of how this may come about.</p>
<p>It is important to stress that, while each example presented here could be addressed by simpler models, our model reproduces all of these features of semantic memory in a single framework. In fact, the examples we present under-represent the power of the model, mostly because of the difficulty of simulating it on a conventional (serial) computer. The framework can in principle deal with more complex situations: a more general prediction of the model is that systems consolidation is especially important for the memorization of multi-level hierarchical structures. This can be tested directly, but, to our knowledge, evidence on this point is not yet available. Larger simulations of the model will be the subject of future work.</p>
<p>A second difference from previous dual memory system models (McClelland et al.,
<xref ref-type="bibr" rid="B45">1995</xref>
) is that these maintain that slower learning processes in the cerebral cortex are necessary to prevent the sudden breakdown of memory caused by storage overload. Our model suggests a different function for the existence of a fast learner (i.e., the hippocampus) and a slow learner (i.e., the neocortex): slow learning and long consolidation intervals are needed because the semantic memory module learns by exploring all the possible parsings of a given episode, a task that requires examining a combinatorially huge number of configurations. This is accomplished by Monte-Carlo optimization, a slow process driven by randomness, which ensures the exploration of the complex phase space on which the probability distribution for the model parameters is defined. Spontaneous activity during sleep, including noisy replay of experience-related neural patterns, could be a source of such randomness (for a related perspective on the role of sleep in consolidation see Derégnaucourt et al.,
<xref ref-type="bibr" rid="B14">2005</xref>
). Because of this, slow learning is needed in our model even when the memory load is too low to cause catastrophic interference. The need for Monte-Carlo optimization arises because learning entails sampling from the set of all possible parsing trees, which makes this architecture harder to train than models based on mean-field approximations (like Boltzmann machines). In our opinion, the payoff for this higher complexity is the ability to detect hidden structure (corresponding to low-probability parsings, or causes, of a single episode) when this structure is repeated over many examples.</p>
<p>Because of the highly structured nature of the underlying generative model, our framework differs markedly from other connectionist approaches to semantic memory (Rogers and McClelland,
<xref ref-type="bibr" rid="B60">2004</xref>
) as well: for example, Boltzmann machines (Hinton et al.,
<xref ref-type="bibr" rid="B26">1985</xref>
) have been used to model memory consolidation and semantic memory formation (Kali and Dayan,
<xref ref-type="bibr" rid="B32">2004</xref>
); however, while the mean-field nature of the learning algorithm makes training faster there, these models can reproduce semantic memory formation only in simple cases.</p>
<p>There are several aspects of systems consolidation that our model does not cover. Importantly, we do not explicitly model episodic memory formation, either in the acquisition or in the consolidation phase. These aspects can be dealt with in future extensions of this framework: the acquisition of a new episodic memory involves the activation of several existing semantic representations that are linked together to form a new configuration (McClelland et al.,
<xref ref-type="bibr" rid="B45">1995</xref>
). While in the present form of the model episodes are depicted as collections of “atomic” items, whose internal structure is not discussed, it is possible to equate these items with pointers to existing semantic representations (transition rules in the SCFG). This would allow the model to learn hierarchies in multiple steps, in which representations at one level are recursively linked together to form representations at the next level. Moreover, due to the Markov property of the underlying stochastic branching process, our grammar model cannot store episodic memories, and therefore cannot model their consolidation. This would become possible if more sophisticated computational grammars were used, such as data-oriented parsing (DOP; Bod et al.,
<xref ref-type="bibr" rid="B8">1991</xref>
), which store in memory the probability of entire previously experienced trees and sub-trees instead of just transition rules.</p>
</sec>
<sec>
<label>4.3</label>
<title>Analogies with cortical circuitry and function</title>
<p>Several theorists have proposed that hierarchical Bayesian inference can be performed in the cerebral cortex by similar, interconnected modules that all perform the same basic operation (see, e.g., Rao and Ballard,
<xref ref-type="bibr" rid="B57">1999</xref>
; Friston,
<xref ref-type="bibr" rid="B21">2003</xref>
; Lee and Mumford,
<xref ref-type="bibr" rid="B39">2003</xref>
; George and Hawkins,
<xref ref-type="bibr" rid="B22">2009</xref>
), in most cases with a particular focus on sensory (particularly visual) processing. In these models, each module roughly corresponds to a cortical (micro)column, and receives both bottom-up and top-down information. Information is then passed to the upper levels of the hierarchy. The model by George and Hawkins (
<xref ref-type="bibr" rid="B22">2009</xref>
), which proposes a related scheme for hierarchical Bayesian inference performed in the neocortex, assumes, as we do (Figure
<xref ref-type="fig" rid="F3">3</xref>
A), that the first stage of each module is an array of coincidence detectors taking inputs from modules at the lower level in the hierarchy, which the authors identify with cortical layer 4. As in our model, further stages (in the superficial layers) combine the results from these detectors to produce the output dispatched to further levels (through the deep layers). Many of the assumptions in that work about how these basic operations may be implemented in a cortical module hold for our model as well. However, modules inspired by the sensory (e.g., visual) system necessarily have a very rigid structure, for example with larger receptive fields at higher levels collecting inputs from units with smaller subfields lower in the hierarchy. In our case, we make fewer assumptions about connectivity: in the idealized version of the model we presented, each node can be a child of any other node. This may be a better fit to the anatomy of the frontal cortices, the main anatomical substrates of semantic memory. Similar to George and Hawkins (
<xref ref-type="bibr" rid="B22">2009</xref>
), it is possible that each cortical micro-circuit is equivalent to a number of different nodes in the formal model, with the nodes activating competitively. Because of the elevated intrinsic connectivity, cortical columns are a natural substrate for this kind of representation. Within each micro-circuit the representation of each node may be distributed, with feedback playing an important role in the computation. In order to make this architecture robust against brain injury, one could assume that, similar in spirit to Nadel and Moscovitch's multiple trace theory (Moscovitch et al.,
<xref ref-type="bibr" rid="B49">2006</xref>
), the same representations are repeated in multiple modules. On the other hand, one can also envision completely different implementations of the model, in which nodes are implemented as attractor states distributed across the neocortex, with parsing corresponding to transitions between attractors (Treves,
<xref ref-type="bibr" rid="B78">2005</xref>
), which would correspond to traversing the parsing tree (Borensztajn et al.,
<xref ref-type="bibr" rid="B9">2009</xref>
).</p>
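<p>The basic operation attributed to a module can be summarized in a short sketch: a first stage of coincidence detectors multiplies the activities of candidate child pairs, and a second stage combines the detector outputs, weighted by the module's transition probabilities. The linear combination used here is an illustrative choice.</p>
<preformat>
def module_output(a_k, child_activities):
    """Activity of the module representing node k. `a_k` maps child
    pairs (i, j) to the transition probability a(i, j, k); each term
    is a coincidence detector on the activities of the two children,
    and the weighted sum is dispatched to higher levels."""
    return sum(w * child_activities[i] * child_activities[j]
               for (i, j), w in a_k.items())
</preformat>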
<p>In our framework, we assume that inputs to the semantic networks come from an episodic memory module, which encodes associations as activity correlations. The hippocampus fits this role well: the recurrent synaptic matrix in CA3 is ideally suited to encode associations (Treves and Rolls,
<xref ref-type="bibr" rid="B79">1994</xref>
), which may be reflected in activity correlations between pairs of hippocampal neurons. It has been shown experimentally (Wilson and McNaughton,
<xref ref-type="bibr" rid="B86">1994</xref>
) that these activity correlations are replayed during sleep and preserve temporal ordering of cell pair activation (Skaggs and McNaughton,
<xref ref-type="bibr" rid="B71">1996</xref>
). Higher-order moments of the activity distribution (corresponding to the distribution of
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
) are also encoded in replayed activity (Lee and Wilson,
<xref ref-type="bibr" rid="B38">2002</xref>
; Euston et al.,
<xref ref-type="bibr" rid="B18">2007</xref>
; Peyrache et al.,
<xref ref-type="bibr" rid="B54">2009</xref>
). As we showed in Figure
<xref ref-type="fig" rid="F6">6</xref>
, this is sufficient to enable the semantic model to extract information about the order of complete sequences: the most likely parsing trees combine these pairwise relationships correctly so that they yield the correct global order.</p>
<p>In this model, learning makes use of plasticity at the synapses from the coincidence detectors to the output unit in each module. This plasticity is driven by the correlation between pre- and post-synaptic activity, including spike-timing-dependent plasticity (Bi and Poo,
<xref ref-type="bibr" rid="B5">1998</xref>
), and is modulated by a top-down signal. For each replayed tree, this top-down signal is computed at the root of the tree, corresponding to higher-order (possibly prefrontal) cortical areas. This signal includes information about the novelty of the parsed episode
<italic>E</italic>
(
<bold>O</bold>
<sup>(
<italic>n</italic>
)</sup>
), which could be carried by neuromodulatory influences, for example dopaminergic, cholinergic, and noradrenergic (Yu and Dayan,
<xref ref-type="bibr" rid="B89">2005</xref>
). We propose two scenarios under which this may take place during sleep. In the first, novelty is computed and translated into changes in the neuromodulatory state during wakefulness, and this affects the strength of the encoding of an episode. This would “tag” the episode and determine greater replay probabilities for novel episodes (Cheng and Frank,
<xref ref-type="bibr" rid="B11">2008</xref>
; O'Neill et al.,
<xref ref-type="bibr" rid="B52">2008</xref>
), which in our framework is equivalent to modulating the episodic strengths for the trees
<italic>E</italic>
(
<italic>t</italic>
) (Eq.
<xref ref-type="disp-formula" rid="E6">6</xref>
). Alternatively, novelty signals could be expressed during sleep, by replay of the activity in the neuromodulatory structures themselves, as has been observed in the dopaminergic system (Valdes et al.,
<xref ref-type="bibr" rid="B83">2008</xref>
).</p>
</sec>
</sec>
<sec>
<title>Conflict of Interest Statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<ack>
<p>Funding was provided by NWO-VICI grant 918.46.609 to Cyriel M. A. Pennartz. We thank Rens Bod and Gideon Borensztajn for inspiring discussions.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ambros-Ingerson</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Granger</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Lynch</surname>
<given-names>G.</given-names>
</name>
</person-group>
(
<year>1990</year>
).
<article-title>Simulation of paleocortex performs hierarchical clustering</article-title>
.
<source>Science</source>
<volume>247</volume>
,
<fpage>1344</fpage>
<lpage>1348</lpage>
<pub-id pub-id-type="doi">10.1126/science.2315702</pub-id>
<pub-id pub-id-type="pmid">2315702</pub-id>
</mixed-citation>
</ref>
<ref id="B2">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Baldi</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Chauvin</surname>
<given-names>Y.</given-names>
</name>
</person-group>
(
<year>1994</year>
).
<article-title>Smooth on-line learning algorithms for hidden markov models</article-title>
.
<source>Neural Comput.</source>
<volume>6</volume>
,
<fpage>307</fpage>
<lpage>318</lpage>
<pub-id pub-id-type="doi">10.1162/neco.1994.6.2.307</pub-id>
</mixed-citation>
</ref>
<ref id="B3">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Battaglia</surname>
<given-names>F. P.</given-names>
</name>
<name>
<surname>Sutherland</surname>
<given-names>G. R.</given-names>
</name>
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
</person-group>
(
<year>2004</year>
).
<article-title>Hippocampal sharp wave bursts coincide with neocortical “up-state” transitions</article-title>
.
<source>Learn. Mem.</source>
<volume>11</volume>
,
<fpage>697</fpage>
<lpage>704</lpage>
<pub-id pub-id-type="doi">10.1101/lm.73504</pub-id>
<pub-id pub-id-type="pmid">15576887</pub-id>
</mixed-citation>
</ref>
<ref id="B4">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bayley</surname>
<given-names>P. J.</given-names>
</name>
<name>
<surname>Gold</surname>
<given-names>J. J.</given-names>
</name>
<name>
<surname>Hopkins</surname>
<given-names>R. O.</given-names>
</name>
<name>
<surname>Squire</surname>
<given-names>L. R.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>The neuroanatomy of remote memory</article-title>
.
<source>Neuron</source>
<volume>46</volume>
,
<fpage>799</fpage>
<lpage>810</lpage>
<pub-id pub-id-type="doi">10.1016/j.neuron.2005.04.034</pub-id>
<pub-id pub-id-type="pmid">15924865</pub-id>
</mixed-citation>
</ref>
<ref id="B5">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bi</surname>
<given-names>G. Q.</given-names>
</name>
<name>
<surname>Poo</surname>
<given-names>M. M.</given-names>
</name>
</person-group>
(
<year>1998</year>
).
<article-title>Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type</article-title>
.
<source>J. Neurosci.</source>
<volume>18</volume>
,
<fpage>10464</fpage>
<lpage>10472</lpage>
<pub-id pub-id-type="pmid">9852584</pub-id>
</mixed-citation>
</ref>
<ref id="B6">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bod</surname>
<given-names>R.</given-names>
</name>
</person-group>
(
<year>2002</year>
).
<article-title>A unified model of structural organization in language and music</article-title>
.
<source>J. Artif. Intell. Res.</source>
<volume>17</volume>
,
<fpage>289</fpage>
<lpage>308</lpage>
</mixed-citation>
</ref>
<ref id="B7">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Bod</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Hay</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Jannedy</surname>
<given-names>S.</given-names>
</name>
</person-group>
(
<year>2003</year>
).
<source>Probabilistic Linguistics</source>
.
<publisher-loc>Cambridge, MA</publisher-loc>
:
<publisher-name>MIT Press</publisher-name>
</mixed-citation>
</ref>
<ref id="B8">
<mixed-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Bod</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Scha</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Sima'an</surname>
<given-names>K.</given-names>
</name>
</person-group>
(
<year>1991</year>
).
<article-title>“Data-oriented parsing,”</article-title>
in
<conf-name>Proceedings of Computational Linguistics</conf-name>
,
<conf-loc>Amsterdam</conf-loc>
,
<fpage>26</fpage>
<lpage>39</lpage>
</mixed-citation>
</ref>
<ref id="B9">
<mixed-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Borensztajn</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Zuidema</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Bod</surname>
<given-names>R.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>“The hierarchical prediction network: towards a neural theory of grammar acquisition,”</article-title>
in
<conf-name>31th Annual Conference of the Cognitive Science Society</conf-name>
,
<conf-loc>Amsterdam</conf-loc>
</mixed-citation>
</ref>
<ref id="B10">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chater</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Tenenbaum</surname>
<given-names>J. B.</given-names>
</name>
<name>
<surname>Yuille</surname>
<given-names>A.</given-names>
</name>
</person-group>
(
<year>2006</year>
).
<article-title>Probabilistic models of cognition: conceptual foundations</article-title>
.
<source>Trends Cogn. Sci. (Regul. Ed.)</source>
<volume>10</volume>
,
<fpage>287</fpage>
<lpage>291</lpage>
<pub-id pub-id-type="doi">10.1016/j.tics.2006.05.007</pub-id>
<pub-id pub-id-type="pmid">16807064</pub-id>
</mixed-citation>
</ref>
<ref id="B11">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Frank</surname>
<given-names>L. M.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>New experiences enhance coordinated neural activity in the hippocampus</article-title>
.
<source>Neuron</source>
<volume>57</volume>
,
<fpage>303</fpage>
<lpage>313</lpage>
<pub-id pub-id-type="doi">10.1016/j.neuron.2007.11.035</pub-id>
<pub-id pub-id-type="pmid">18215626</pub-id>
</mixed-citation>
</ref>
<ref id="B12">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Cohen</surname>
<given-names>N. J.</given-names>
</name>
<name>
<surname>Eichenbaum</surname>
<given-names>H.</given-names>
</name>
</person-group>
(
<year>1993</year>
).
<source>Memory, Amnesia, and the Hippocampal System</source>
.
<publisher-loc>Cambridge, MA</publisher-loc>
:
<publisher-name>MIT Press</publisher-name>
</mixed-citation>
</ref>
<ref id="B13">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Courville</surname>
<given-names>A. C.</given-names>
</name>
<name>
<surname>Daw</surname>
<given-names>N. D.</given-names>
</name>
<name>
<surname>Touretzky</surname>
<given-names>D. S.</given-names>
</name>
</person-group>
(
<year>2006</year>
).
<article-title>Bayesian theories of conditioning in a changing world</article-title>
.
<source>Trends Cogn. Sci. (Regul. Ed.)</source>
<volume>10</volume>
,
<fpage>294</fpage>
<lpage>300</lpage>
<pub-id pub-id-type="doi">10.1016/j.tics.2006.05.004</pub-id>
<pub-id pub-id-type="pmid">16793323</pub-id>
</mixed-citation>
</ref>
<ref id="B14">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Derégnaucourt</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Mitra</surname>
<given-names>P. P.</given-names>
</name>
<name>
<surname>Fehér</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Pytte</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Tchernichovski</surname>
<given-names>O.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>How sleep affects the developmental learning of bird song</article-title>
.
<source>Nature</source>
<volume>433</volume>
,
<fpage>710</fpage>
<lpage>716</lpage>
<pub-id pub-id-type="doi">10.1038/nature03275</pub-id>
<pub-id pub-id-type="pmid">15716944</pub-id>
</mixed-citation>
</ref>
<ref id="B15">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dienes</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Altmann</surname>
<given-names>G. T.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>S.-J. J.</given-names>
</name>
</person-group>
(
<year>1999</year>
).
<article-title>Mapping across domains without feedback: a neural network model of transfer of implicit knowledge</article-title>
.
<source>Cogn. Sci.</source>
<volume>23</volume>
,
<fpage>53</fpage>
<lpage>82</lpage>
<pub-id pub-id-type="doi">10.1207/s15516709cog2301_3</pub-id>
</mixed-citation>
</ref>
<ref id="B16">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ellenbogen</surname>
<given-names>J. M.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>P. T.</given-names>
</name>
<name>
<surname>Payne</surname>
<given-names>J. D.</given-names>
</name>
<name>
<surname>Titone</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Walker</surname>
<given-names>M. P.</given-names>
</name>
</person-group>
(
<year>2007</year>
).
<article-title>Human relational memory requires time and sleep</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<volume>104</volume>
,
<fpage>7723</fpage>
<lpage>7728</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0700094104</pub-id>
<pub-id pub-id-type="pmid">17449637</pub-id>
</mixed-citation>
</ref>
<ref id="B17">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ernst</surname>
<given-names>M. O.</given-names>
</name>
<name>
<surname>Banks</surname>
<given-names>M. S.</given-names>
</name>
</person-group>
(
<year>2002</year>
).
<article-title>Humans integrate visual and haptic information in a statistically optimal fashion</article-title>
.
<source>Nature</source>
<volume>415</volume>
,
<fpage>429</fpage>
<lpage>433</lpage>
<pub-id pub-id-type="doi">10.1038/415429a</pub-id>
<pub-id pub-id-type="pmid">11807554</pub-id>
</mixed-citation>
</ref>
<ref id="B18">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Euston</surname>
<given-names>D. R.</given-names>
</name>
<name>
<surname>Tatsuno</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
</person-group>
(
<year>2007</year>
).
<article-title>Fast-forward playback of recent memory sequences in prefrontal cortex during sleep</article-title>
.
<source>Science</source>
<volume>318</volume>
,
<fpage>1147</fpage>
<lpage>1150</lpage>
<pub-id pub-id-type="doi">10.1126/science.1148979</pub-id>
<pub-id pub-id-type="pmid">18006749</pub-id>
</mixed-citation>
</ref>
<ref id="B19">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Felleman</surname>
<given-names>D. J.</given-names>
</name>
<name>
<surname>Van Essen</surname>
<given-names>D. C.</given-names>
</name>
</person-group>
(
<year>1991</year>
).
<article-title>Distributed hierarchical processing in the primate cerebral cortex</article-title>
.
<source>Cereb. Cortex</source>
<volume>1</volume>
,
<fpage>1</fpage>
<lpage>47</lpage>
<pub-id pub-id-type="doi">10.1093/cercor/1.1.1</pub-id>
<pub-id pub-id-type="pmid">1822724</pub-id>
</mixed-citation>
</ref>
<ref id="B20">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Frankland</surname>
<given-names>P. W.</given-names>
</name>
<name>
<surname>Bontempi</surname>
<given-names>B.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>The organization of recent and remote memories</article-title>
.
<source>Nat. Rev. Neurosci.</source>
<volume>6</volume>
,
<fpage>119</fpage>
<lpage>130</lpage>
<pub-id pub-id-type="doi">10.1038/nrn1607</pub-id>
<pub-id pub-id-type="pmid">15685217</pub-id>
</mixed-citation>
</ref>
<ref id="B21">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Friston</surname>
<given-names>K.</given-names>
</name>
</person-group>
(
<year>2003</year>
).
<article-title>Learning and inference in the brain</article-title>
.
<source>Neural Netw.</source>
<volume>16</volume>
,
<fpage>1325</fpage>
<lpage>1352</lpage>
<pub-id pub-id-type="doi">10.1016/j.neunet.2003.06.005</pub-id>
<pub-id pub-id-type="pmid">14622888</pub-id>
</mixed-citation>
</ref>
<ref id="B22">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>George</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Hawkins</surname>
<given-names>J.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>Towards a mathematical theory of cortical micro-circuits</article-title>
.
<source>PLoS Comput. Biol.</source>
<volume>5</volume>
,
<fpage>e1000532</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1000532</pub-id>
<pub-id pub-id-type="pmid">19816557</pub-id>
</mixed-citation>
</ref>
<ref id="B23">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gómez</surname>
<given-names>R. L.</given-names>
</name>
<name>
<surname>Bootzin</surname>
<given-names>R. R.</given-names>
</name>
<name>
<surname>Nadel</surname>
<given-names>L.</given-names>
</name>
</person-group>
(
<year>2006</year>
).
<article-title>Naps promote abstraction in language-learning infants</article-title>
.
<source>Psychol. Sci.</source>
<volume>17</volume>
,
<fpage>670</fpage>
<lpage>674</lpage>
<pub-id pub-id-type="doi">10.1111/j.1467-9280.2006.01764.x</pub-id>
<pub-id pub-id-type="pmid">16913948</pub-id>
</mixed-citation>
</ref>
<ref id="B24">
<mixed-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Goodman</surname>
<given-names>J.</given-names>
</name>
</person-group>
(
<year>1996</year>
).
<article-title>“Parsing algorithms and metrics,”</article-title>
in
<conf-name>Annual Meeting-Association for Computational Linguistics</conf-name>
,
<conf-loc>Philadelphia, PA</conf-loc>
, Vol.
<volume>34</volume>
,
<fpage>177</fpage>
<lpage>183</lpage>
</mixed-citation>
</ref>
<ref id="B25">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Goodman</surname>
<given-names>N. D.</given-names>
</name>
<name>
<surname>Tenenbaum</surname>
<given-names>J. B.</given-names>
</name>
<name>
<surname>Feldman</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Griffiths</surname>
<given-names>T. L.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>A rational analysis of rule-based concept learning</article-title>
.
<source>Cogn. Sci.</source>
<volume>32</volume>
,
<fpage>108</fpage>
<lpage>154</lpage>
<pub-id pub-id-type="doi">10.1080/03640210701802071</pub-id>
<pub-id pub-id-type="pmid">21635333</pub-id>
</mixed-citation>
</ref>
<ref id="B26">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hinton</surname>
<given-names>G. E.</given-names>
</name>
<name>
<surname>Sejnowski</surname>
<given-names>T. J.</given-names>
</name>
<name>
<surname>Ackley</surname>
<given-names>D. H.</given-names>
</name>
</person-group>
(
<year>1985</year>
).
<article-title>Boltzmann machines: constraint satisfaction networks that learn</article-title>
.
<source>Cogn. Sci.</source>
<volume>9</volume>
,
<fpage>147</fpage>
<lpage>169</lpage>
<pub-id pub-id-type="doi">10.1207/s15516709cog0901_7</pub-id>
</mixed-citation>
</ref>
<ref id="B27">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hoffman</surname>
<given-names>K. L.</given-names>
</name>
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
</person-group>
(
<year>2002</year>
).
<article-title>Coordinated reactivation of distributed memory traces in primate neocortex</article-title>
.
<source>Science</source>
<volume>297</volume>
,
<fpage>2070</fpage>
<lpage>2073</lpage>
<pub-id pub-id-type="doi">10.1126/science.1072640</pub-id>
<pub-id pub-id-type="pmid">12242447</pub-id>
</mixed-citation>
</ref>
<ref id="B28">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hopfield</surname>
<given-names>J. J.</given-names>
</name>
</person-group>
(
<year>1982</year>
).
<article-title>Neural networks and physical systems with emergent collective computational abilities</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<volume>79</volume>
,
<fpage>2554</fpage>
<lpage>2558</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.79.8.2554</pub-id>
<pub-id pub-id-type="pmid">6953413</pub-id>
</mixed-citation>
</ref>
<ref id="B29">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Isomura</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Sirota</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Ozen</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Montgomery</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Mizuseki</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Henze</surname>
<given-names>D. A.</given-names>
</name>
<name>
<surname>Buzsáki</surname>
<given-names>G.</given-names>
</name>
</person-group>
(
<year>2006</year>
).
<article-title>Integration and segregation of activity in entorhinal-hippocampal subregions by neocortical slow oscillations</article-title>
.
<source>Neuron</source>
<volume>52</volume>
,
<fpage>871</fpage>
<lpage>882</lpage>
<pub-id pub-id-type="doi">10.1016/j.neuron.2006.10.023</pub-id>
<pub-id pub-id-type="pmid">17145507</pub-id>
</mixed-citation>
</ref>
<ref id="B30">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ji</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Wilson</surname>
<given-names>M. A.</given-names>
</name>
</person-group>
(
<year>2007</year>
).
<article-title>Coordinated memory replay in the visual cortex and hippocampus during sleep</article-title>
.
<source>Nat. Neurosci.</source>
<volume>10</volume>
,
<fpage>100</fpage>
<lpage>107</lpage>
<pub-id pub-id-type="doi">10.1038/nn1825</pub-id>
<pub-id pub-id-type="pmid">17173043</pub-id>
</mixed-citation>
</ref>
<ref id="B31">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kahana</surname>
<given-names>M. J.</given-names>
</name>
</person-group>
(
<year>1996</year>
).
<article-title>Associative retrieval processes in free recall</article-title>
.
<source>Mem. Cognit.</source>
<volume>24</volume>
,
<fpage>103</fpage>
<lpage>109</lpage>
<pub-id pub-id-type="doi">10.3758/BF03197276</pub-id>
<pub-id pub-id-type="pmid">8822162</pub-id>
</mixed-citation>
</ref>
<ref id="B32">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kali</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Dayan</surname>
<given-names>P.</given-names>
</name>
</person-group>
(
<year>2004</year>
).
<article-title>Off-line replay maintains declarative memories in a model of hippocampal-neocortical interactions</article-title>
.
<source>Nat. Neurosci.</source>
<volume>7</volume>
,
<fpage>286</fpage>
<lpage>294</lpage>
<pub-id pub-id-type="doi">10.1038/nn1202</pub-id>
<pub-id pub-id-type="pmid">14983183</pub-id>
</mixed-citation>
</ref>
<ref id="B33">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kemp</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Tenenbaum</surname>
<given-names>J. B.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>The discovery of structural form</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<volume>105</volume>
,
<fpage>10687</fpage>
<lpage>10692</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0802631105</pub-id>
<pub-id pub-id-type="pmid">18669663</pub-id>
</mixed-citation>
</ref>
<ref id="B34">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kim</surname>
<given-names>J. J.</given-names>
</name>
<name>
<surname>Fanselow</surname>
<given-names>M. S.</given-names>
</name>
</person-group>
(
<year>1992</year>
).
<article-title>Modality-specific retrograde amnesia of fear</article-title>
.
<source>Science</source>
<volume>256</volume>
,
<fpage>675</fpage>
<lpage>677</lpage>
<pub-id pub-id-type="doi">10.1126/science.1585183</pub-id>
<pub-id pub-id-type="pmid">1585183</pub-id>
</mixed-citation>
</ref>
<ref id="B35">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Körding</surname>
<given-names>K. P.</given-names>
</name>
<name>
<surname>Wolpert</surname>
<given-names>D. M.</given-names>
</name>
</person-group>
(
<year>2004</year>
).
<article-title>Bayesian integration in sensorimotor learning</article-title>
.
<source>Nature</source>
<volume>427</volume>
,
<fpage>244</fpage>
<lpage>247</lpage>
<pub-id pub-id-type="doi">10.1038/nature02169</pub-id>
<pub-id pub-id-type="pmid">14724638</pub-id>
</mixed-citation>
</ref>
<ref id="B36">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kudrimoti</surname>
<given-names>H. S.</given-names>
</name>
<name>
<surname>Barnes</surname>
<given-names>C. A.</given-names>
</name>
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
</person-group>
(
<year>1999</year>
).
<article-title>Reactivation of hippocampal cell assemblies: effects of behavioral state, experience, and EEG dynamics</article-title>
.
<source>J. Neurosci.</source>
<volume>19</volume>
,
<fpage>4090</fpage>
<lpage>4101</lpage>
<pub-id pub-id-type="pmid">10234037</pub-id>
</mixed-citation>
</ref>
<ref id="B37">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lari</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>S. J.</given-names>
</name>
</person-group>
(
<year>1990</year>
).
<article-title>The estimation of stochastic context-free grammars using the inside-outside algorithm</article-title>
.
<source>Comput. Speech Lang.</source>
<volume>4</volume>
,
<fpage>35</fpage>
<lpage>56</lpage>
<pub-id pub-id-type="doi">10.1016/0885-2308(90)90022-X</pub-id>
</mixed-citation>
</ref>
<ref id="B38">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>A. K.</given-names>
</name>
<name>
<surname>Wilson</surname>
<given-names>M. A.</given-names>
</name>
</person-group>
(
<year>2002</year>
).
<article-title>Memory of sequential experience in the hippocampus during slow wave sleep</article-title>
.
<source>Neuron</source>
<volume>36</volume>
,
<fpage>1183</fpage>
<lpage>1194</lpage>
<pub-id pub-id-type="doi">10.1016/S0896-6273(02)01024-3</pub-id>
<pub-id pub-id-type="pmid">12495631</pub-id>
</mixed-citation>
</ref>
<ref id="B39">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>T. S.</given-names>
</name>
<name>
<surname>Mumford</surname>
<given-names>D.</given-names>
</name>
</person-group>
(
<year>2003</year>
).
<article-title>Hierarchical Bayesian inference in the visual cortex</article-title>
.
<source>J. Opt. Soc. Am. A</source>
<volume>20</volume>
,
<fpage>1434</fpage>
<lpage>1448</lpage>
<pub-id pub-id-type="doi">10.1364/JOSAA.20.001434</pub-id>
</mixed-citation>
</ref>
<ref id="B40">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ljungberg</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Apicella</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Schultz</surname>
<given-names>W.</given-names>
</name>
</person-group>
(
<year>1992</year>
).
<article-title>Responses of monkey dopamine neurons during learning of behavioral reactions</article-title>
.
<source>J. Neurophysiol.</source>
<volume>67</volume>
,
<fpage>145</fpage>
<lpage>163</lpage>
<pub-id pub-id-type="pmid">1552316</pub-id>
</mixed-citation>
</ref>
<ref id="B41">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Manning</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Schütze</surname>
<given-names>H.</given-names>
</name>
</person-group>
(
<year>1999</year>
).
<source>Foundations of Statistical Natural Language Processing</source>
.
<publisher-loc>Cambridge, MA</publisher-loc>
:
<publisher-name>MIT Press</publisher-name>
</mixed-citation>
</ref>
<ref id="B42">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Markram</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Lubke</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Frotscher</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Sakmann</surname>
<given-names>B.</given-names>
</name>
</person-group>
(
<year>1997</year>
).
<article-title>Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs</article-title>
.
<source>Science</source>
<volume>275</volume>
,
<fpage>213</fpage>
<lpage>215</lpage>
<pub-id pub-id-type="doi">10.1126/science.275.5297.213</pub-id>
<pub-id pub-id-type="pmid">8985014</pub-id>
</mixed-citation>
</ref>
<ref id="B43">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Marr</surname>
<given-names>D.</given-names>
</name>
</person-group>
(
<year>1971</year>
).
<article-title>Simple memory: a theory for archicortex</article-title>
.
<source>Philos. Trans. R. Soc. Lond. B Biol. Sci.</source>
<volume>262</volume>
,
<fpage>23</fpage>
<lpage>81</lpage>
<pub-id pub-id-type="doi">10.1098/rstb.1971.0078</pub-id>
<pub-id pub-id-type="pmid">4399412</pub-id>
</mixed-citation>
</ref>
<ref id="B44">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Maviel</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Durkin</surname>
<given-names>T. P.</given-names>
</name>
<name>
<surname>Menzaghi</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Bontempi</surname>
<given-names>B.</given-names>
</name>
</person-group>
(
<year>2004</year>
).
<article-title>Sites of neocortical reorganization critical for remote spatial memory</article-title>
.
<source>Science</source>
<volume>305</volume>
,
<fpage>96</fpage>
<lpage>99</lpage>
<pub-id pub-id-type="doi">10.1126/science.1098180</pub-id>
<pub-id pub-id-type="pmid">15232109</pub-id>
</mixed-citation>
</ref>
<ref id="B45">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McClelland</surname>
<given-names>J. L.</given-names>
</name>
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
<name>
<surname>O'Reilly</surname>
<given-names>R. C.</given-names>
</name>
</person-group>
(
<year>1995</year>
).
<article-title>Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory</article-title>
.
<source>Psychol. Rev.</source>
<volume>102</volume>
,
<fpage>419</fpage>
<lpage>457</lpage>
<pub-id pub-id-type="doi">10.1037/0033-295X.102.3.419</pub-id>
<pub-id pub-id-type="pmid">7624455</pub-id>
</mixed-citation>
</ref>
<ref id="B46">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
<name>
<surname>Barnes</surname>
<given-names>C. A.</given-names>
</name>
<name>
<surname>Battaglia</surname>
<given-names>F. P.</given-names>
</name>
<name>
<surname>Bower</surname>
<given-names>M. R.</given-names>
</name>
<name>
<surname>Cowen</surname>
<given-names>S. L.</given-names>
</name>
<name>
<surname>Ekstrom</surname>
<given-names>A. D.</given-names>
</name>
<name>
<surname>Gerrard</surname>
<given-names>J. L.</given-names>
</name>
<name>
<surname>Hoffman</surname>
<given-names>K. L.</given-names>
</name>
<name>
<surname>Houston</surname>
<given-names>F. P.</given-names>
</name>
<name>
<surname>Karten</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Lipa</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Pennartz</surname>
<given-names>C. M.</given-names>
</name>
<name>
<surname>Sutherland</surname>
<given-names>G. R.</given-names>
</name>
</person-group>
(
<year>2002</year>
).
<article-title>“Off-line reprocessing of recent memory and its role in memory consolidation: a progress report,”</article-title>
in
<source>Sleep and Brain Plasticity</source>
, eds
<person-group person-group-type="editor">
<name>
<surname>Maquet</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Stickgold</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>C.</given-names>
</name>
</person-group>
(
<publisher-loc>Oxford</publisher-loc>
:
<publisher-name>Oxford University Press</publisher-name>
).</mixed-citation>
</ref>
<ref id="B47">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
<name>
<surname>Morris</surname>
<given-names>R. G. M.</given-names>
</name>
</person-group>
(
<year>1987</year>
).
<article-title>Hippocampal synaptic enhancement and information storage within a distributed memory system</article-title>
.
<source>Trends Neurosci.</source>
<volume>10</volume>
,
<fpage>408</fpage>
<lpage>415</lpage>
<pub-id pub-id-type="doi">10.1016/0166-2236(87)90011-7</pub-id>
</mixed-citation>
</ref>
<ref id="B48">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mesulam</surname>
<given-names>M. M.</given-names>
</name>
<name>
<surname>Mufson</surname>
<given-names>E. J.</given-names>
</name>
</person-group>
(
<year>1984</year>
).
<article-title>Neural inputs into the nucleus basalis of the substantia innominata (Ch4) in the rhesus monkey</article-title>
.
<source>Brain</source>
<volume>107</volume>
(
<issue>Pt 1</issue>
),
<fpage>253</fpage>
<lpage>274</lpage>
<pub-id pub-id-type="pmid">6538106</pub-id>
</mixed-citation>
</ref>
<ref id="B49">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Moscovitch</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Nadel</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Winocur</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Gilboa</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Rosenbaum</surname>
<given-names>R. S.</given-names>
</name>
</person-group>
(
<year>2006</year>
).
<article-title>The cognitive neuroscience of remote episodic, semantic and spatial memory</article-title>
.
<source>Curr. Opin. Neurobiol.</source>
<volume>16</volume>
,
<fpage>179</fpage>
<lpage>190</lpage>
<pub-id pub-id-type="doi">10.1016/j.conb.2006.03.013</pub-id>
<pub-id pub-id-type="pmid">16564688</pub-id>
</mixed-citation>
</ref>
<ref id="B50">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Moscovitch</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Rosenbaum</surname>
<given-names>R. S.</given-names>
</name>
<name>
<surname>Gilboa</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Addis</surname>
<given-names>D. R.</given-names>
</name>
<name>
<surname>Westmacott</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Grady</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>McAndrews</surname>
<given-names>M. P.</given-names>
</name>
<name>
<surname>Levine</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Black</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Winocur</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Nadel</surname>
<given-names>L.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>Functional neuroanatomy of remote episodic, semantic and spatial memory: a unified account based on multiple trace theory</article-title>
.
<source>J. Anat.</source>
<volume>207</volume>
,
<fpage>35</fpage>
<lpage>66</lpage>
<pub-id pub-id-type="doi">10.1111/j.1469-7580.2005.00421.x</pub-id>
<pub-id pub-id-type="pmid">16011544</pub-id>
</mixed-citation>
</ref>
<ref id="B51">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Neal</surname>
<given-names>R. M.</given-names>
</name>
<name>
<surname>Hinton</surname>
<given-names>G. E.</given-names>
</name>
</person-group>
(
<year>1998</year>
).
<article-title>A view of the EM algorithm that justifies incremental, sparse, and other variants</article-title>
.
<source>Learn. Graph. Models</source>
<volume>89</volume>
,
<fpage>355</fpage>
<lpage>368</lpage>
</mixed-citation>
</ref>
<ref id="B52">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>O'Neill</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Senior</surname>
<given-names>T. J.</given-names>
</name>
<name>
<surname>Allen</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Huxter</surname>
<given-names>J. R.</given-names>
</name>
<name>
<surname>Csicsvari</surname>
<given-names>J.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>Reactivation of experience-dependent cell assembly patterns in the hippocampus</article-title>
.
<source>Nat. Neurosci.</source>
<volume>11</volume>
,
<fpage>209</fpage>
<lpage>215</lpage>
<pub-id pub-id-type="doi">10.1038/nn2037</pub-id>
<pub-id pub-id-type="pmid">18193040</pub-id>
</mixed-citation>
</ref>
<ref id="B53">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Orbán</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Fiser</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Aslin</surname>
<given-names>R. N.</given-names>
</name>
<name>
<surname>Lengyel</surname>
<given-names>M.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>Bayesian learning of visual chunks by human observers</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<volume>105</volume>
,
<fpage>2745</fpage>
<lpage>2750</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0708424105</pub-id>
<pub-id pub-id-type="pmid">18268353</pub-id>
</mixed-citation>
</ref>
<ref id="B54">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peyrache</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Khamassi</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Benchenane</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Wiener</surname>
<given-names>S. I.</given-names>
</name>
<name>
<surname>Battaglia</surname>
<given-names>F. P.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>Replay of rule learning related neural patterns in the prefrontal cortex during sleep</article-title>
.
<source>Nat. Neurosci.</source>
<volume>12</volume>
,
<fpage>919</fpage>
<lpage>926</lpage>
<pub-id pub-id-type="doi">10.1038/nn.2337</pub-id>
<pub-id pub-id-type="pmid">19483687</pub-id>
</mixed-citation>
</ref>
<ref id="B55">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Quillian</surname>
<given-names>M. R.</given-names>
</name>
</person-group>
(
<year>1968</year>
).
<article-title>“Semantic memory,”</article-title>
in
<source>Semantic Information Processing</source>
, ed.
<person-group person-group-type="editor">
<name>
<surname>Minsky</surname>
<given-names>M.</given-names>
</name>
</person-group>
(
<publisher-loc>Cambridge, MA</publisher-loc>
:
<publisher-name>MIT Press</publisher-name>
),
<fpage>227</fpage>
<lpage>270</lpage>
</mixed-citation>
</ref>
<ref id="B56">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rabiner</surname>
<given-names>L. R.</given-names>
</name>
</person-group>
(
<year>1989</year>
).
<article-title>A tutorial on hidden Markov models and selected applications in speech recognition</article-title>
.
<source>Proc. IEEE</source>
<volume>77</volume>
,
<fpage>257</fpage>
<lpage>285</lpage>
<pub-id pub-id-type="doi">10.1109/5.18626</pub-id>
</mixed-citation>
</ref>
<ref id="B57">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rao</surname>
<given-names>R. P.</given-names>
</name>
<name>
<surname>Ballard</surname>
<given-names>D. H.</given-names>
</name>
</person-group>
(
<year>1999</year>
).
<article-title>Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects</article-title>
.
<source>Nat. Neurosci.</source>
<volume>2</volume>
,
<fpage>79</fpage>
<lpage>87</lpage>
<pub-id pub-id-type="doi">10.1038/4580</pub-id>
<pub-id pub-id-type="pmid">10195184</pub-id>
</mixed-citation>
</ref>
<ref id="B58">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rasch</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Born</surname>
<given-names>J.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>Maintaining memories by reactivation</article-title>
.
<source>Curr. Opin. Neurobiol.</source>
<volume>17</volume>
,
<fpage>698</fpage>
<lpage>703</lpage>
<pub-id pub-id-type="doi">10.1016/j.conb.2007.11.007</pub-id>
<pub-id pub-id-type="pmid">18222688</pub-id>
</mixed-citation>
</ref>
<ref id="B59">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Raven</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Raven</surname>
<given-names>J. C.</given-names>
</name>
<name>
<surname>Court</surname>
<given-names>J. H.</given-names>
</name>
</person-group>
(
<year>1998</year>
).
<source>Raven Manual: Sec. 4. Advanced Progressive Matrices</source>
.
<publisher-loc>Oxford</publisher-loc>
:
<publisher-name>Oxford Psychologists Press</publisher-name>
</mixed-citation>
</ref>
<ref id="B60">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Rogers</surname>
<given-names>T. T.</given-names>
</name>
<name>
<surname>McClelland</surname>
<given-names>J. L.</given-names>
</name>
</person-group>
(
<year>2004</year>
).
<source>Semantic Cognition: A Parallel Distributed Processing Approach</source>
.
<publisher-loc>Cambridge, MA</publisher-loc>
:
<publisher-name>MIT Press</publisher-name>
</mixed-citation>
</ref>
<ref id="B61">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rohlf</surname>
<given-names>F. J.</given-names>
</name>
</person-group>
(
<year>1983</year>
).
<article-title>Numbering binary trees with labeled terminal vertices</article-title>
.
<source>Bull. Math. Biol.</source>
<volume>45</volume>
,
<fpage>33</fpage>
<lpage>40</lpage>
<pub-id pub-id-type="doi">10.1007/BF02459385</pub-id>
<pub-id pub-id-type="pmid">6850159</pub-id>
</mixed-citation>
</ref>
<ref id="B62">
<mixed-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Savova</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Jäkel</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Tenenbaum</surname>
<given-names>J. B.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>“Grammar-based object representations in a scene parsing task,”</article-title>
in
<conf-name>Cognitive Science Conference</conf-name>
,
<conf-loc>Amsterdam</conf-loc>
</mixed-citation>
</ref>
<ref id="B63">
<mixed-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Savova</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Tenenbaum</surname>
<given-names>J. B.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>“A grammar-based approach to visual category learning,”</article-title>
in
<conf-name>Cognitive Science Conference</conf-name>
,
<conf-loc>Washington, DC</conf-loc>
</mixed-citation>
</ref>
<ref id="B64">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schabus</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Hödlmoser</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Gruber</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Sauter</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Anderer</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Klösch</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Parapatics</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Saletu</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Klimesch</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Zeitlhofer</surname>
<given-names>J.</given-names>
</name>
</person-group>
(
<year>2006</year>
).
<article-title>Sleep spindle-related activity in the human EEG and its relation to general cognitive and learning abilities</article-title>
.
<source>Eur. J. Neurosci.</source>
<volume>23</volume>
,
<fpage>1738</fpage>
<lpage>1746</lpage>
<pub-id pub-id-type="doi">10.1111/j.1460-9568.2006.04694.x</pub-id>
<pub-id pub-id-type="pmid">16623830</pub-id>
</mixed-citation>
</ref>
<ref id="B65">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scoville</surname>
<given-names>W. B.</given-names>
</name>
<name>
<surname>Milner</surname>
<given-names>B.</given-names>
</name>
</person-group>
(
<year>1957</year>
).
<article-title>Loss of recent memory after bilateral hippocampal lesions</article-title>
.
<source>J. Neurol. Neurosurg. Psychiatr.</source>
<volume>20</volume>
,
<fpage>11</fpage>
<lpage>21</lpage>
<pub-id pub-id-type="doi">10.1136/jnnp.20.1.11</pub-id>
<pub-id pub-id-type="pmid">13406589</pub-id>
</mixed-citation>
</ref>
<ref id="B66">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shastri</surname>
<given-names>L.</given-names>
</name>
</person-group>
(
<year>2002</year>
).
<article-title>Episodic memory and cortico-hippocampal interactions</article-title>
.
<source>Trends Cogn. Sci. (Regul. Ed.)</source>
<volume>6</volume>
,
<fpage>162</fpage>
<lpage>168</lpage>
<pub-id pub-id-type="doi">10.1016/S1364-6613(02)01868-5</pub-id>
<pub-id pub-id-type="pmid">11912039</pub-id>
</mixed-citation>
</ref>
<ref id="B67">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shen</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
</person-group>
(
<year>1996</year>
).
<article-title>Modeling the spontaneous reactivation of experience-specific hippocampal cell assemblies during sleep</article-title>
.
<source>Hippocampus</source>
<volume>6</volume>
,
<fpage>685</fpage>
<lpage>692</lpage>
<pub-id pub-id-type="doi">10.1002/(SICI)1098-1063(1996)6:6<685::AID-HIPO11>3.3.CO;2-M</pub-id>
<pub-id pub-id-type="pmid">9034855</pub-id>
</mixed-citation>
</ref>
<ref id="B68">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Siapas</surname>
<given-names>A. G.</given-names>
</name>
<name>
<surname>Wilson</surname>
<given-names>M. A.</given-names>
</name>
</person-group>
(
<year>1998</year>
).
<article-title>Coordinated interactions between hippocampal ripples and cortical spindles during slow-wave sleep</article-title>
.
<source>Neuron</source>
<volume>21</volume>
,
<fpage>1123</fpage>
<lpage>1128</lpage>
<pub-id pub-id-type="doi">10.1016/S0896-6273(00)80629-7</pub-id>
<pub-id pub-id-type="pmid">9856467</pub-id>
</mixed-citation>
</ref>
<ref id="B69">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Simoncelli</surname>
<given-names>E. P.</given-names>
</name>
<name>
<surname>Olshausen</surname>
<given-names>B. A.</given-names>
</name>
</person-group>
(
<year>2001</year>
).
<article-title>Natural image statistics and neural representation</article-title>
.
<source>Annu. Rev. Neurosci.</source>
<volume>24</volume>
,
<fpage>1193</fpage>
<lpage>1216</lpage>
<pub-id pub-id-type="doi">10.1146/annurev.neuro.24.1.1193</pub-id>
<pub-id pub-id-type="pmid">11520932</pub-id>
</mixed-citation>
</ref>
<ref id="B70">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sirota</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Csicsvari</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Buhl</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Buzsáki</surname>
<given-names>G.</given-names>
</name>
</person-group>
(
<year>2003</year>
).
<article-title>Communication between neocortex and hippocampus during sleep in rodents</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<volume>100</volume>
,
<fpage>2065</fpage>
<lpage>2069</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0437938100</pub-id>
<pub-id pub-id-type="pmid">12576550</pub-id>
</mixed-citation>
</ref>
<ref id="B71">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Skaggs</surname>
<given-names>W. E.</given-names>
</name>
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
</person-group>
(
<year>1996</year>
).
<article-title>Replay of neuronal firing sequences in rat hippocampus during sleep following spatial experience</article-title>
.
<source>Science</source>
<volume>271</volume>
,
<fpage>1870</fpage>
<lpage>1873</lpage>
<pub-id pub-id-type="doi">10.1126/science.271.5257.1870</pub-id>
<pub-id pub-id-type="pmid">8596957</pub-id>
</mixed-citation>
</ref>
<ref id="B72">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Squire</surname>
<given-names>L. R.</given-names>
</name>
</person-group>
(
<year>1982</year>
).
<article-title>The neuropsychology of human memory</article-title>
.
<source>Annu. Rev. Neurosci.</source>
<volume>5</volume>
,
<fpage>241</fpage>
<lpage>273</lpage>
<pub-id pub-id-type="doi">10.1146/annurev.ne.05.030182.001325</pub-id>
<pub-id pub-id-type="pmid">7073209</pub-id>
</mixed-citation>
</ref>
<ref id="B73">
<mixed-citation publication-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Stolcke</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Omohundro</surname>
<given-names>S.</given-names>
</name>
</person-group>
(
<year>1994</year>
).
<article-title>“Inducing probabilistic grammars by Bayesian model merging,”</article-title>
in
<conf-name>Proceedings of 2nd International Colloquium on Grammatical Inference, ICGI’94</conf-name>
,
<conf-loc>Alicante</conf-loc>
</mixed-citation>
</ref>
<ref id="B74">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sutherland</surname>
<given-names>R. J.</given-names>
</name>
<name>
<surname>Rudy</surname>
<given-names>J. W.</given-names>
</name>
</person-group>
(
<year>1989</year>
).
<article-title>Configural association theory: the role of the hippocampal formation in learning, memory, and amnesia</article-title>
.
<source>Psychobiology</source>
<volume>17</volume>
,
<fpage>129</fpage>
<lpage>144</lpage>
</mixed-citation>
</ref>
<ref id="B75">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Takashima</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Petersson</surname>
<given-names>K. M.</given-names>
</name>
<name>
<surname>Rutters</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Tendolkar</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Jensen</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Zwarts</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
<name>
<surname>Fernández</surname>
<given-names>G.</given-names>
</name>
</person-group>
(
<year>2006</year>
).
<article-title>Declarative memory consolidation in humans: a prospective functional magnetic resonance imaging study</article-title>
.
<source>Proc. Natl. Acad. Sci. U.S.A.</source>
<volume>103</volume>
,
<fpage>756</fpage>
<lpage>761</lpage>
<pub-id pub-id-type="doi">10.1073/pnas.0507774103</pub-id>
<pub-id pub-id-type="pmid">16407110</pub-id>
</mixed-citation>
</ref>
<ref id="B76">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Teyler</surname>
<given-names>T. J.</given-names>
</name>
<name>
<surname>DiScenna</surname>
<given-names>P.</given-names>
</name>
</person-group>
(
<year>1986</year>
).
<article-title>The hippocampal memory indexing theory</article-title>
.
<source>Behav. Neurosci.</source>
<volume>100</volume>
,
<fpage>147</fpage>
<lpage>154</lpage>
<pub-id pub-id-type="doi">10.1037/0735-7044.100.2.147</pub-id>
<pub-id pub-id-type="pmid">3008780</pub-id>
</mixed-citation>
</ref>
<ref id="B77">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Tomasello</surname>
<given-names>M.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<source>Constructing a Language: A Usage-Based Theory of Language Acquisition</source>
.
<publisher-loc>Cambridge, MA</publisher-loc>
:
<publisher-name>Harvard University Press</publisher-name>
</mixed-citation>
</ref>
<ref id="B78">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Treves</surname>
<given-names>A.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>Frontal latching networks: a possible neural basis for infinite recursion</article-title>
.
<source>Cogn. Neuropsychol.</source>
<volume>22</volume>
,
<fpage>276</fpage>
<lpage>291</lpage>
<pub-id pub-id-type="doi">10.1080/02643290442000329</pub-id>
<pub-id pub-id-type="pmid">21038250</pub-id>
</mixed-citation>
</ref>
<ref id="B79">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Treves</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Rolls</surname>
<given-names>E. T.</given-names>
</name>
</person-group>
(
<year>1994</year>
).
<article-title>Computational analysis of the role of the hippocampus in memory</article-title>
.
<source>Hippocampus</source>
<volume>4</volume>
,
<fpage>374</fpage>
<lpage>391</lpage>
<pub-id pub-id-type="doi">10.1002/hipo.450040319</pub-id>
<pub-id pub-id-type="pmid">7842058</pub-id>
</mixed-citation>
</ref>
<ref id="B80">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tse</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Langston</surname>
<given-names>R. F.</given-names>
</name>
<name>
<surname>Kakeyama</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Bethus</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Spooner</surname>
<given-names>P. A.</given-names>
</name>
<name>
<surname>Wood</surname>
<given-names>E. R.</given-names>
</name>
<name>
<surname>Witter</surname>
<given-names>M. P.</given-names>
</name>
<name>
<surname>Morris</surname>
<given-names>R. G. M.</given-names>
</name>
</person-group>
(
<year>2007</year>
).
<article-title>Schemas and memory consolidation</article-title>
.
<source>Science</source>
<volume>316</volume>
,
<fpage>76</fpage>
<lpage>82</lpage>
<pub-id pub-id-type="doi">10.1126/science.1135935</pub-id>
<pub-id pub-id-type="pmid">17412951</pub-id>
</mixed-citation>
</ref>
<ref id="B81">
<mixed-citation publication-type="book">
<person-group person-group-type="editor">
<name>
<surname>Tulving</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Craik</surname>
<given-names>F. I. M.</given-names>
</name>
</person-group>
(eds). (
<year>2000</year>
).
<source>Concepts of Memory</source>
.
<publisher-loc>New York</publisher-loc>
:
<publisher-name>Oxford University Press</publisher-name>
,
<fpage>33</fpage>
<lpage>44</lpage>
</mixed-citation>
</ref>
<ref id="B82">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ullman</surname>
<given-names>S.</given-names>
</name>
</person-group>
(
<year>2007</year>
).
<article-title>Object recognition and segmentation by a fragment-based hierarchy</article-title>
.
<source>Trends Cogn. Sci. (Regul. Ed.)</source>
<volume>11</volume>
,
<fpage>58</fpage>
<lpage>64</lpage>
<pub-id pub-id-type="doi">10.1016/j.tics.2006.11.009</pub-id>
<pub-id pub-id-type="pmid">17188555</pub-id>
</mixed-citation>
</ref>
<ref id="B83">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Valdes</surname>
<given-names>J. L.</given-names>
</name>
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
<name>
<surname>Fellous</surname>
<given-names>J. M.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<source>Reactivation of populations of Ventral Tegmental Area neurons in the rat</source>
. Abstract 687.19,
<publisher-name>Society for Neuroscience</publisher-name>
,
<publisher-loc>Washington, DC</publisher-loc>
</mixed-citation>
</ref>
<ref id="B84">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wagner</surname>
<given-names>U.</given-names>
</name>
<name>
<surname>Gais</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Haider</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Verleger</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Born</surname>
<given-names>J.</given-names>
</name>
</person-group>
(
<year>2004</year>
).
<article-title>Sleep inspires insight</article-title>
.
<source>Nature</source>
<volume>427</volume>
,
<fpage>352</fpage>
<lpage>355</lpage>
<pub-id pub-id-type="doi">10.1038/nature02223</pub-id>
<pub-id pub-id-type="pmid">14737168</pub-id>
</mixed-citation>
</ref>
<ref id="B85">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wei</surname>
<given-names>G. C. G.</given-names>
</name>
<name>
<surname>Tanner</surname>
<given-names>M. A.</given-names>
</name>
</person-group>
(
<year>1990</year>
).
<article-title>A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms</article-title>
.
<source>J. Am. Stat. Assoc.</source>
<volume>85</volume>
,
<fpage>699</fpage>
<lpage>704</lpage>
<pub-id pub-id-type="doi">10.2307/2289560</pub-id>
</mixed-citation>
</ref>
<ref id="B86">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wilson</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>McNaughton</surname>
<given-names>B. L.</given-names>
</name>
</person-group>
(
<year>1994</year>
).
<article-title>Reactivation of hippocampal ensemble memories during sleep</article-title>
.
<source>Science</source>
<volume>265</volume>
,
<fpage>676</fpage>
<lpage>679</lpage>
<pub-id pub-id-type="doi">10.1126/science.8036517</pub-id>
<pub-id pub-id-type="pmid">8036517</pub-id>
</mixed-citation>
</ref>
<ref id="B87">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Winocur</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Moscovitch</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Sekeres</surname>
<given-names>M.</given-names>
</name>
</person-group>
(
<year>2007</year>
).
<article-title>Memory consolidation or transformation: context manipulation and hippocampal representations of memory</article-title>
.
<source>Nat. Neurosci.</source>
<volume>10</volume>
,
<fpage>555</fpage>
<lpage>557</lpage>
<pub-id pub-id-type="doi">10.1038/nn1880</pub-id>
<pub-id pub-id-type="pmid">17396121</pub-id>
</mixed-citation>
</ref>
<ref id="B88">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yordanova</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Kolev</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Wagner</surname>
<given-names>U.</given-names>
</name>
<name>
<surname>Verleger</surname>
<given-names>R.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>Covert reorganization of implicit task representations by slow wave sleep</article-title>
.
<source>PLoS ONE</source>
<volume>4</volume>
,
<fpage>e5675</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pone.0005675</pub-id>
<pub-id pub-id-type="pmid">19479080</pub-id>
</mixed-citation>
</ref>
<ref id="B89">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>A. J.</given-names>
</name>
<name>
<surname>Dayan</surname>
<given-names>P.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>Uncertainty, neuromodulation, and attention</article-title>
.
<source>Neuron</source>
<volume>46</volume>
,
<fpage>681</fpage>
<lpage>692</lpage>
<pub-id pub-id-type="doi">10.1016/j.neuron.2005.04.026</pub-id>
<pub-id pub-id-type="pmid">15944135</pub-id>
</mixed-citation>
</ref>
<ref id="B90">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zaborszky</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gaykema</surname>
<given-names>R. P.</given-names>
</name>
<name>
<surname>Swanson</surname>
<given-names>D. J.</given-names>
</name>
<name>
<surname>Cullinan</surname>
<given-names>W. E.</given-names>
</name>
</person-group>
(
<year>1997</year>
).
<article-title>Cortical input to the basal forebrain</article-title>
.
<source>Neuroscience</source>
<volume>79</volume>
,
<fpage>1051</fpage>
<lpage>1078</lpage>
<pub-id pub-id-type="doi">10.1016/S0306-4522(97)00049-3</pub-id>
<pub-id pub-id-type="pmid">9219967</pub-id>
</mixed-citation>
</ref>
<ref id="B91">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yuille</surname>
<given-names>A.</given-names>
</name>
</person-group>
(
<year>2007</year>
).
<article-title>Unsupervised learning of a probabilistic grammar for object detection and parsing</article-title>
.
<source>Adv. Neural Inf. Process Syst.</source>
<volume>19</volume>
,
<fpage>1617</fpage>
</mixed-citation>
</ref>
<ref id="B92">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zola-Morgan</surname>
<given-names>S. M.</given-names>
</name>
<name>
<surname>Squire</surname>
<given-names>L. R.</given-names>
</name>
</person-group>
(
<year>1990</year>
).
<article-title>The primate hippocampal formation: evidence for a time-limited role in memory storage</article-title>
.
<source>Science</source>
<volume>250</volume>
,
<fpage>288</fpage>
<lpage>290</lpage>
<pub-id pub-id-type="doi">10.1126/science.2218534</pub-id>
<pub-id pub-id-type="pmid">2218534</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/HapticV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001D19 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 001D19 | SxmlIndent | more
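
As a usage sketch (assuming the Dilib tools above are on your PATH and that $EXPLOR_STEP is set as in the first command; the output filename is only illustrative), the indented record can be written to a file instead of being paged:

# Save the indented XML record rather than viewing it with "more"
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001D19 | SxmlIndent > 001D19.xml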

To link to this page from within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    HapticV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:3157741
   |texte=   The Construction of Semantic Memory: Grammar-Based Representations Learned from Relational Episodic Information
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:21887143" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a HapticV1 
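
A reading of this pipeline, inferred from the command names and flags shown above (Dilib's exact semantics are an assumption here): HfdIndexSelect looks up the key "pubmed:21887143" in the RBID index, HfdSelect fetches the matching record from biblio.hfd, and NlmPubMed2Wicri renders it as wiki pages for the HapticV1 area.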

Wicri

This area was generated with Dilib version V0.6.23.
Data generation: Mon Jun 13 01:09:46 2016. Site generation: Wed Mar 6 09:54:07 2024