Serveur d'exploration sur SGML

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

Identifying approximately common substructures in trees based on a restricted edit distance

Identifieur interne : 001D18 ( Main/Exploration ); précédent : 001D17; suivant : 001D19

Identifying approximately common substructures in trees based on a restricted edit distance

Auteurs : Jason T. L. Wang [États-Unis] ; Kaizhong Zhang [Canada] ; Chia-Yo Chang [États-Unis]

Source :

RBID : ISTEX:D2553AFFD17E1A621938606C9F99B56BB79A2091

English descriptors

Abstract

Abstract: In this paper we present a dynamic programming algorithm for identifying the largest approximately common substructures (LACSs) of two ordered labeled trees. We consider a substructure of a tree T to be a connected subgraph of T. Given two trees T1,T2 and an integer d, the LACS problem is to find a substructure U1 of T1 and a substructure U2 of T2 such that U1 is within distance d of U2 and where there does not exist any other substructure V1 of T1 and V2 of T2 such that V1 and V2 satisfy the distance restriction and the sum of the sizes of V1 and V2 is greater than the sum of the sizes of U1 and U2. The distance measure considered in the paper is a restricted edit distance originated from Selkow. The LACS problem is motivated by the studies of program and document comparison. The proposed algorithm solves the problem in time O(d2×|T1|×|T2|), which is as fast as the best known algorithm for calculating Selkow's distance of two trees when the distance allowed in the common substructures is a constant independent of the input trees.

Url:
DOI: 10.1016/S0020-0255(99)00100-0


Affiliations:


Links toward previous steps (curation, corpus...)


Le document en format XML

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Identifying approximately common substructures in trees based on a restricted edit distance</title>
<author>
<name sortKey="Wang, Jason T L" sort="Wang, Jason T L" uniqKey="Wang J" first="Jason T. L." last="Wang">Jason T. L. Wang</name>
</author>
<author>
<name sortKey="Zhang, Kaizhong" sort="Zhang, Kaizhong" uniqKey="Zhang K" first="Kaizhong" last="Zhang">Kaizhong Zhang</name>
</author>
<author>
<name sortKey="Chang, Chia Yo" sort="Chang, Chia Yo" uniqKey="Chang C" first="Chia-Yo" last="Chang">Chia-Yo Chang</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:D2553AFFD17E1A621938606C9F99B56BB79A2091</idno>
<date when="1999" year="1999">1999</date>
<idno type="doi">10.1016/S0020-0255(99)00100-0</idno>
<idno type="url">https://api.istex.fr/ark:/67375/6H6-PXS7F0BT-7/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">003636</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">003636</idno>
<idno type="wicri:Area/Istex/Curation">002A50</idno>
<idno type="wicri:Area/Istex/Checkpoint">001B03</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">001B03</idno>
<idno type="wicri:doubleKey">0020-0255:1999:Wang J:identifying:approximately:common</idno>
<idno type="wicri:Area/Main/Merge">001D78</idno>
<idno type="wicri:Area/Main/Curation">001D18</idno>
<idno type="wicri:Area/Main/Exploration">001D18</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a">Identifying approximately common substructures in trees based on a restricted edit distance</title>
<author>
<name sortKey="Wang, Jason T L" sort="Wang, Jason T L" uniqKey="Wang J" first="Jason T. L." last="Wang">Jason T. L. Wang</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer and Information Science, New Jersey Institute of Technology, University Heights, Newark, NJ 07102</wicri:regionArea>
<placeName>
<region type="state">New Jersey</region>
</placeName>
</affiliation>
<affiliation></affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Kaizhong" sort="Zhang, Kaizhong" uniqKey="Zhang K" first="Kaizhong" last="Zhang">Kaizhong Zhang</name>
<affiliation wicri:level="1">
<country>Canada</country>
<wicri:regionArea>Department of Computer Science, The University of Western Ontario, London, Ont.</wicri:regionArea>
<wicri:noRegion>Ont.</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Canada</country>
</affiliation>
</author>
<author>
<name sortKey="Chang, Chia Yo" sort="Chang, Chia Yo" uniqKey="Chang C" first="Chia-Yo" last="Chang">Chia-Yo Chang</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Computer and Information Science, New Jersey Institute of Technology, University Heights, Newark, NJ 07102</wicri:regionArea>
<placeName>
<region type="state">New Jersey</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Information Sciences</title>
<title level="j" type="abbrev">INS</title>
<idno type="ISSN">0020-0255</idno>
<imprint>
<publisher>ELSEVIER</publisher>
<date type="published" when="1999">1999</date>
<biblScope unit="volume">121</biblScope>
<biblScope unit="issue">3–4</biblScope>
<biblScope unit="page" from="367">367</biblScope>
<biblScope unit="page" to="386">386</biblScope>
</imprint>
<idno type="ISSN">0020-0255</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0020-0255</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="Teeft" xml:lang="en">
<term>Algorithm</term>
<term>Common substructure</term>
<term>Common substructures</term>
<term>Computer science</term>
<term>Consecutive sequence</term>
<term>Consistent subtree cuts</term>
<term>Distance measure</term>
<term>Distance value</term>
<term>Document databases</term>
<term>Dynamic programming</term>
<term>Information sciences</term>
<term>Lac</term>
<term>Lacs problem</term>
<term>Mapping</term>
<term>Maximum size</term>
<term>Minimum cost</term>
<term>Node</term>
<term>Optimal solution</term>
<term>Parse trees</term>
<term>Pattern discovery</term>
<term>Selkow</term>
<term>Shasha</term>
<term>Size values</term>
<term>Smallest sets</term>
<term>Substructure</term>
<term>Subtree</term>
<term>Subtrees</term>
<term>Wang</term>
<term>Zhang</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: In this paper we present a dynamic programming algorithm for identifying the largest approximately common substructures (LACSs) of two ordered labeled trees. We consider a substructure of a tree T to be a connected subgraph of T. Given two trees T1,T2 and an integer d, the LACS problem is to find a substructure U1 of T1 and a substructure U2 of T2 such that U1 is within distance d of U2 and where there does not exist any other substructure V1 of T1 and V2 of T2 such that V1 and V2 satisfy the distance restriction and the sum of the sizes of V1 and V2 is greater than the sum of the sizes of U1 and U2. The distance measure considered in the paper is a restricted edit distance originated from Selkow. The LACS problem is motivated by the studies of program and document comparison. The proposed algorithm solves the problem in time O(d2×|T1|×|T2|), which is as fast as the best known algorithm for calculating Selkow's distance of two trees when the distance allowed in the common substructures is a constant independent of the input trees.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Canada</li>
<li>États-Unis</li>
</country>
<region>
<li>New Jersey</li>
</region>
</list>
<tree>
<country name="États-Unis">
<region name="New Jersey">
<name sortKey="Wang, Jason T L" sort="Wang, Jason T L" uniqKey="Wang J" first="Jason T. L." last="Wang">Jason T. L. Wang</name>
</region>
<name sortKey="Chang, Chia Yo" sort="Chang, Chia Yo" uniqKey="Chang C" first="Chia-Yo" last="Chang">Chia-Yo Chang</name>
<name sortKey="Chang, Chia Yo" sort="Chang, Chia Yo" uniqKey="Chang C" first="Chia-Yo" last="Chang">Chia-Yo Chang</name>
<name sortKey="Wang, Jason T L" sort="Wang, Jason T L" uniqKey="Wang J" first="Jason T. L." last="Wang">Jason T. L. Wang</name>
</country>
<country name="Canada">
<noRegion>
<name sortKey="Zhang, Kaizhong" sort="Zhang, Kaizhong" uniqKey="Zhang K" first="Kaizhong" last="Zhang">Kaizhong Zhang</name>
</noRegion>
<name sortKey="Zhang, Kaizhong" sort="Zhang, Kaizhong" uniqKey="Zhang K" first="Kaizhong" last="Zhang">Kaizhong Zhang</name>
</country>
</tree>
</affiliations>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Informatique/explor/SgmlV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001D18 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001D18 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Wicri/Informatique
   |area=    SgmlV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:D2553AFFD17E1A621938606C9F99B56BB79A2091
   |texte=   Identifying approximately common substructures in trees based on a restricted edit distance
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jul 1 14:26:08 2019. Site generation: Wed Apr 28 21:40:44 2021