Compilation of dictionaries for semantic attribute analysis of television news captions

Internal identifier: 001810 (Main/Merge); previous: 001809; next: 001811

Authors: Ichiro Ide [Japan]; Reiko Hamada [Japan]; Shuichi Sakai [Japan]; Hidehiko Tanaka [Japan]

Source:

Systems and Computers in Japan [0882-1666]; 2003-11-15.

RBID : ISTEX:C0C0323E3979ABF70A5D71C305052E35B494F8AA

English descriptors

KwdEn:
- caption, dictionary, indexing, semantic attribute, suffix noun.

Abstract

With the increase in the amount of video that is broadcast daily, there is an increasing need for storage of video in a systematic way for future reuse and retrieval. In particular, from the viewpoint of importance and usability, it is desirable to index news videos. For adequate automatic indexing based on the text information in the video, it is not sufficient to apply the simple index extraction and annotation methods which have been widely used in conventional methods. It is important to select index candidates with reference to semantic attributes. The purpose of this study is to compile dictionaries which are needed for analyzing the semantic attributes of captions (noun phrases) in TV news videos. We describe the process by which words are extracted from text corpora and a thesaurus for storage on the basis of specified conditions. The quality of the dictionaries is examined by analysis of the semantic attributes of the words appearing in actual news videos, and the results are presented. In evaluation experiments in which an existing proper noun dictionary and temporal noun dictionary were combined and used, a recall of 79 to 93% and a precision of 41 to 71% were obtained. Although the precision is low in this result, it is concluded that the compiled dictionaries are of practical use for indexing since the recall is more important in that case. © 2003 Wiley Periodicals, Inc. Syst Comp Jpn, 34(12): 32–44, 2003; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.10417

Url:

https://api.istex.fr/document/C0C0323E3979ABF70A5D71C305052E35B494F8AA/fulltext/pdf

DOI: 10.1002/scj.10417

Links toward previous steps (curation, corpus...)

to stream Istex, to step Corpus: 000736
to stream Istex, to step Curation: 000728
to stream Istex, to step Checkpoint: 000F06

Links to Exploration step

ISTEX:C0C0323E3979ABF70A5D71C305052E35B494F8AA

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Compilation of dictionaries for semantic attribute analysis of television news captions</title>
<author><name sortKey="Ide, Ichiro" sort="Ide, Ichiro" uniqKey="Ide I" first="Ichiro" last="Ide">Ichiro Ide</name>
</author>
<author><name sortKey="Hamada, Reiko" sort="Hamada, Reiko" uniqKey="Hamada R" first="Reiko" last="Hamada">Reiko Hamada</name>
</author>
<author><name sortKey="Sakai, Shuichi" sort="Sakai, Shuichi" uniqKey="Sakai S" first="Shuichi" last="Sakai">Shuichi Sakai</name>
</author>
<author><name sortKey="Tanaka, Hidehiko" sort="Tanaka, Hidehiko" uniqKey="Tanaka H" first="Hidehiko" last="Tanaka">Hidehiko Tanaka</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:C0C0323E3979ABF70A5D71C305052E35B494F8AA</idno>
<date when="2003" year="2003">2003</date>
<idno type="doi">10.1002/scj.10417</idno>
<idno type="url">https://api.istex.fr/document/C0C0323E3979ABF70A5D71C305052E35B494F8AA/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000736</idno>
<idno type="wicri:Area/Istex/Curation">000728</idno>
<idno type="wicri:Area/Istex/Checkpoint">000F06</idno>
<idno type="wicri:doubleKey">0882-1666:2003:Ide I:compilation:of:dictionaries</idno>
<idno type="wicri:Area/Main/Merge">001810</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Compilation of dictionaries for semantic attribute analysis of television news captions</title>
<author><name sortKey="Ide, Ichiro" sort="Ide, Ichiro" uniqKey="Ide I" first="Ichiro" last="Ide">Ichiro Ide</name>
<affiliation wicri:level="3"><country xml:lang="fr">Japon</country>
<wicri:regionArea>National Institute of Informatics, Tokyo</wicri:regionArea>
<placeName><settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Hamada, Reiko" sort="Hamada, Reiko" uniqKey="Hamada R" first="Reiko" last="Hamada">Reiko Hamada</name>
<affiliation wicri:level="3"><country xml:lang="fr">Japon</country>
<wicri:regionArea>Graduate School of Engineering, The University of Tokyo, Tokyo</wicri:regionArea>
<placeName><settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Sakai, Shuichi" sort="Sakai, Shuichi" uniqKey="Sakai S" first="Shuichi" last="Sakai">Shuichi Sakai</name>
<affiliation wicri:level="3"><country xml:lang="fr">Japon</country>
<wicri:regionArea>Graduate School of Information Science and Technology, The University of Tokyo, Tokyo</wicri:regionArea>
<placeName><settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Tanaka, Hidehiko" sort="Tanaka, Hidehiko" uniqKey="Tanaka H" first="Hidehiko" last="Tanaka">Hidehiko Tanaka</name>
<affiliation wicri:level="3"><country xml:lang="fr">Japon</country>
<wicri:regionArea>Graduate School of Information Science and Technology, The University of Tokyo, Tokyo</wicri:regionArea>
<placeName><settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Systems and Computers in Japan</title>
<title level="j" type="abbrev">Syst. Comp. Jpn.</title>
<idno type="ISSN">0882-1666</idno>
<idno type="eISSN">1520-684X</idno>
<imprint><publisher>Wiley Subscription Services, Inc., A Wiley Company</publisher>
<pubPlace>Hoboken</pubPlace>
<date type="published" when="2003-11-15">2003-11-15</date>
<biblScope unit="volume">34</biblScope>
<biblScope unit="issue">12</biblScope>
<biblScope unit="page" from="32">32</biblScope>
<biblScope unit="page" to="44">44</biblScope>
</imprint>
<idno type="ISSN">0882-1666</idno>
</series>
<idno type="istex">C0C0323E3979ABF70A5D71C305052E35B494F8AA</idno>
<idno type="DOI">10.1002/scj.10417</idno>
<idno type="ArticleID">SCJ10417</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0882-1666</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>caption</term>
<term>dictionary</term>
<term>indexing</term>
<term>semantic attribute</term>
<term>suffix noun</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">With the increase in the amount of video that is broadcast daily, there is an increasing need for storage of video in a systematic way for future reuse and retrieval. In particular, from the viewpoint of importance and usability, it is desirable to index news videos. For adequate automatic indexing based on the text information in the video, it is not sufficient to apply the simple index extraction and annotation methods which have been widely used in conventional methods. It is important to select index candidates with reference to semantic attributes. The purpose of this study is to compile dictionaries which are needed for analyzing the semantic attributes of captions (noun phrases) in TV news videos. We describe the process by which words are extracted from text corpora and a thesaurus for storage on the basis of specified conditions. The quality of the dictionaries is examined by analysis of the semantic attributes of the words appearing in actual news videos, and the results are presented. In evaluation experiments in which an existing proper noun dictionary and temporal noun dictionary were combined and used, a recall of 79 to 93% and a precision of 41 to 71% were obtained. Although the precision is low in this result, it is concluded that the compiled dictionaries are of practical use for indexing since the recall is more important in that case. © 2003 Wiley Periodicals, Inc. Syst Comp Jpn, 34(12): 32–44, 2003; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.10417</div>
</front>
</TEI>
</record>
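
The record above can be read programmatically. The following is a minimal sketch, not part of the Dilib toolchain, that pulls the title, authors, and DOI out of such a record with Python's standard `xml.etree.ElementTree`. One detail matters: the record uses the `wicri:` prefix without declaring it, so a strict XML parser rejects the record as written. The sketch works around this by wrapping the record in an element that binds the prefix to a placeholder URI; `WICRI_NS`, `parse_record`, and the shortened `sample` string are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the record never declares the `wicri:` prefix,
# so any URI works here as long as the prefix is bound to something.
WICRI_NS = "http://example.org/wicri"


def parse_record(record_xml: str) -> dict:
    """Extract title, authors, and DOI from a Wicri <record> string."""
    # Wrap the record in an element that declares the missing prefix.
    wrapped = f'<wrapper xmlns:wicri="{WICRI_NS}">{record_xml}</wrapper>'
    root = ET.fromstring(wrapped)

    # The article-level metadata lives under <analytic>.
    analytic = root.find(".//analytic")
    title = analytic.findtext("title")
    authors = [name.text for name in analytic.findall("author/name")]

    # The DOI is stored as <idno type="doi"> in <publicationStmt>.
    doi = root.findtext(".//publicationStmt/idno[@type='doi']")
    return {"title": title, "authors": authors, "doi": doi}


# Shortened copy of the record, keeping only the elements used here.
sample = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc>
<publicationStmt><idno type="doi">10.1002/scj.10417</idno></publicationStmt>
<sourceDesc><biblStruct><analytic>
<title level="a" type="main" xml:lang="en">Compilation of dictionaries for semantic attribute analysis of television news captions</title>
<author><name sortKey="Ide, Ichiro">Ichiro Ide</name>
<affiliation wicri:level="3"><country xml:lang="fr">Japon</country></affiliation></author>
<author><name sortKey="Hamada, Reiko">Reiko Hamada</name></author>
</analytic></biblStruct></sourceDesc>
</fileDesc></teiHeader></TEI></record>"""

info = parse_record(sample)
print(info["title"])
print(info["authors"])
print(info["doi"])
```

The same function should accept the full record unchanged, since the extra elements are simply ignored by the path expressions.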

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Merge

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001810 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Merge/biblio.hfd -nk 001810 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Merge
   |type=    RBID
   |clé=     ISTEX:C0C0323E3979ABF70A5D71C305052E35B494F8AA
   |texte=   Compilation of dictionaries for semantic attribute analysis of television news captions
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
