OCR exploration server

Segmentation of Mixed Chinese/English Document Including Scattered Italic Characters

Internal identifier: 002815 (Istex/Corpus); previous: 002814; next: 002816

Authors: Yong Xia; Chun-Heng Wang; Ru-Wei Dai

Source:

RBID : ISTEX:DD409C0DADEED45D7BE31E2E877D68EFE748A53D

Abstract

Segmenting mixed Chinese/English documents is difficult when many italic characters are scattered through the text. Most prior work concentrates on English documents, but mixed documents differ from them and have special features that must be taken into account. This paper proposes a new approach to the problem. First, an appropriate character area is chosen for italic detection. Next, a two-step strategy is adopted: italic determination is performed first, and the slant angle is estimated only if the character pattern is identified as italic. Finally, the italic character pattern is corrected by a shear transform. A method based on a two-step weighted projection profile histogram is introduced for italic determination, together with a fast algorithm for estimating the slant angle. Three large sample collections, of characters, character pairs, and documents respectively, are used to evaluate the method, and encouraging results are achieved.
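
As a rough illustration of the last two steps named in the abstract (slant-angle estimation followed by shear correction), a minimal Python/NumPy sketch is given here. It is not the authors' algorithm: the abstract does not spell out the two-step weighted projection profile histogram or the fast estimator, so the sketch substitutes a plain brute-force search that keeps the shear angle giving the sharpest vertical projection profile. The function names `shear` and `estimate_slant`, the 0 to 30 degree search range, and the sum-of-squares sharpness score are all assumptions made for illustration.

```python
import numpy as np


def shear(img, angle_deg):
    """Horizontally shear a 2-D binary image (row 0 is the top row).

    A positive angle moves the top of the image to the left relative to
    the bottom, which undoes a right-leaning (italic) slant.
    """
    h, w = img.shape
    t = np.tan(np.radians(angle_deg))
    # Leftward shift of each row, measured from the bottom row (shift 0).
    shifts = np.rint((h - 1 - np.arange(h)) * t).astype(int)
    # Turn the shifts into non-negative column offsets in a wider canvas.
    offsets = shifts.max() - shifts
    out = np.zeros((h, w + int(shifts.max() - shifts.min())), dtype=img.dtype)
    for y in range(h):
        out[y, offsets[y]:offsets[y] + w] = img[y]
    return out


def estimate_slant(img, candidates=range(0, 31)):
    """Return the candidate angle (degrees) with the sharpest column profile.

    Assumption: a brute-force search, used here only as a stand-in for the
    paper's fast estimator, which the abstract does not describe.
    """
    best_angle, best_score = 0, -1.0
    for a in candidates:
        # Vertical projection profile: number of ink pixels per column.
        profile = shear(img, a).sum(axis=0).astype(float)
        # Upright strokes concentrate ink in few columns, raising this sum.
        score = float((profile ** 2).sum())
        if score > best_score:
            best_angle, best_score = a, score
    return best_angle
```

A caller would typically run `estimate_slant` only on patterns already judged to be italic, then pass the result to `shear` to obtain the upright pattern, mirroring the order of steps in the abstract.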

Url:
DOI: 10.1007/11940098_2

Links to Exploration step

ISTEX:DD409C0DADEED45D7BE31E2E877D68EFE748A53D

The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct:series">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Segmentation of Mixed Chinese/English Document Including Scattered Italic Characters</title>
<author>
<name sortKey="Xia, Yong" sort="Xia, Yong" uniqKey="Xia Y" first="Yong" last="Xia">Yong Xia</name>
<affiliation>
<mods:affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: yong.xia@ia.ac.cn</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Wang, Chun Heng" sort="Wang, Chun Heng" uniqKey="Wang C" first="Chun-Heng" last="Wang">Chun-Heng Wang</name>
<affiliation>
<mods:affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: chunheng.wang@ia.ac.cn</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Dai, Ru Wei" sort="Dai, Ru Wei" uniqKey="Dai R" first="Ru-Wei" last="Dai">Ru-Wei Dai</name>
<affiliation>
<mods:affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: ruwei.dai@ia.ac.cn</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:DD409C0DADEED45D7BE31E2E877D68EFE748A53D</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1007/11940098_2</idno>
<idno type="url">https://api.istex.fr/document/DD409C0DADEED45D7BE31E2E877D68EFE748A53D/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">002815</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Segmentation of Mixed Chinese/English Document Including Scattered Italic Characters</title>
<author>
<name sortKey="Xia, Yong" sort="Xia, Yong" uniqKey="Xia Y" first="Yong" last="Xia">Yong Xia</name>
<affiliation>
<mods:affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: yong.xia@ia.ac.cn</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Wang, Chun Heng" sort="Wang, Chun Heng" uniqKey="Wang C" first="Chun-Heng" last="Wang">Chun-Heng Wang</name>
<affiliation>
<mods:affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: chunheng.wang@ia.ac.cn</mods:affiliation>
</affiliation>
</author>
<author>
<name sortKey="Dai, Ru Wei" sort="Dai, Ru Wei" uniqKey="Dai R" first="Ru-Wei" last="Dai">Ru-Wei Dai</name>
<affiliation>
<mods:affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</mods:affiliation>
</affiliation>
<affiliation>
<mods:affiliation>E-mail: ruwei.dai@ia.ac.cn</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2006</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">DD409C0DADEED45D7BE31E2E877D68EFE748A53D</idno>
<idno type="DOI">10.1007/11940098_2</idno>
<idno type="ChapterID">2</idno>
<idno type="ChapterID">Chap2</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: It is difficult to segment mixed Chinese/English documents when there are many italic characters scattered in documents. Most contributions attach more attention to English documents. However, mixed document is different from English document and some special features should be considered. This paper gives a new way to solve the problem. At first, an appropriate character area is chosen to detect italic. Next, a two-step strategy is adopted. Italic determination is done first and then if the character pattern is identified as italic, the estimation of slant angle will be done. Finally the italic character pattern is corrected by shear transform. A method of adopting two-step weighted projection profile histogram for italic determination is introduced. And a fast algorithm to estimate slant angle is also introduced. Three large sample collections, including character and character-pair and document respectively, are provided to evaluate our method and encouraging results are achieved.</div>
</front>
</TEI>
<istex>
<corpusName>springer</corpusName>
<author>
<json:item>
<name>Yong Xia</name>
<affiliations>
<json:string>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</json:string>
<json:string>E-mail: yong.xia@ia.ac.cn</json:string>
</affiliations>
</json:item>
<json:item>
<name>Chun-Heng Wang</name>
<affiliations>
<json:string>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</json:string>
<json:string>E-mail: chunheng.wang@ia.ac.cn</json:string>
</affiliations>
</json:item>
<json:item>
<name>Ru-Wei Dai</name>
<affiliations>
<json:string>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</json:string>
<json:string>E-mail: ruwei.dai@ia.ac.cn</json:string>
</affiliations>
</json:item>
</author>
<language>
<json:string>eng</json:string>
</language>
<abstract>Abstract: It is difficult to segment mixed Chinese/English documents when there are many italic characters scattered in documents. Most contributions attach more attention to English documents. However, mixed document is different from English document and some special features should be considered. This paper gives a new way to solve the problem. At first, an appropriate character area is chosen to detect italic. Next, a two-step strategy is adopted. Italic determination is done first and then if the character pattern is identified as italic, the estimation of slant angle will be done. Finally the italic character pattern is corrected by shear transform. A method of adopting two-step weighted projection profile histogram for italic determination is introduced. And a fast algorithm to estimate slant angle is also introduced. Three large sample collections, including character and character-pair and document respectively, are provided to evaluate our method and encouraging results are achieved.</abstract>
<qualityIndicators>
<score>4.67</score>
<pdfVersion>1.3</pdfVersion>
<pdfPageSize>430 x 660 pts</pdfPageSize>
<refBibsNative>false</refBibsNative>
<keywordCount>0</keywordCount>
<abstractCharCount>1008</abstractCharCount>
<pdfWordCount>2894</pdfWordCount>
<pdfCharCount>17218</pdfCharCount>
<pdfPageCount>9</pdfPageCount>
<abstractWordCount>148</abstractWordCount>
</qualityIndicators>
<title>Segmentation of Mixed Chinese/English Document Including Scattered Italic Characters</title>
<genre.original>
<json:string>OriginalPaper</json:string>
</genre.original>
<chapterId>
<json:string>2</json:string>
<json:string>Chap2</json:string>
</chapterId>
<genre>
<json:string>conference [eBooks]</json:string>
</genre>
<serie>
<editor>
<json:item>
<name>David Hutchison</name>
<affiliations>
<json:string>Lancaster University, UK</json:string>
</affiliations>
</json:item>
<json:item>
<name>Takeo Kanade</name>
<affiliations>
<json:string>Carnegie Mellon University, Pittsburgh, PA, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Josef Kittler</name>
<affiliations>
<json:string>University of Surrey, Guildford, UK</json:string>
</affiliations>
</json:item>
<json:item>
<name>Jon M. Kleinberg</name>
<affiliations>
<json:string>Cornell University, Ithaca, NY, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Friedemann Mattern</name>
<affiliations>
<json:string>ETH Zurich, Switzerland</json:string>
</affiliations>
</json:item>
<json:item>
<name>John C. Mitchell</name>
<affiliations>
<json:string>Stanford University, CA, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Moni Naor</name>
<affiliations>
<json:string>Weizmann Institute of Science, Rehovot, Israel</json:string>
</affiliations>
</json:item>
<json:item>
<name>Oscar Nierstrasz</name>
<affiliations>
<json:string>University of Bern, Switzerland</json:string>
</affiliations>
</json:item>
<json:item>
<name>C. Pandu Rangan</name>
<affiliations>
<json:string>Indian Institute of Technology, Madras, India</json:string>
</affiliations>
</json:item>
<json:item>
<name>Bernhard Steffen</name>
<affiliations>
<json:string>University of Dortmund, Germany</json:string>
</affiliations>
</json:item>
<json:item>
<name>Madhu Sudan</name>
<affiliations>
<json:string>Massachusetts Institute of Technology, MA, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Demetri Terzopoulos</name>
<affiliations>
<json:string>University of California, Los Angeles, CA, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Doug Tygar</name>
<affiliations>
<json:string>University of California, Berkeley, CA, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Moshe Y. Vardi</name>
<affiliations>
<json:string>Rice University, Houston, TX, USA</json:string>
</affiliations>
</json:item>
<json:item>
<name>Gerhard Weikum</name>
<affiliations>
<json:string>Max-Planck Institute of Computer Science, Saarbruecken, Germany</json:string>
</affiliations>
</json:item>
</editor>
<issn>
<json:string>0302-9743</json:string>
</issn>
<language>
<json:string>unknown</json:string>
</language>
<eissn>
<json:string>1611-3349</json:string>
</eissn>
<title>Lecture Notes in Computer Science</title>
<copyrightDate>2006</copyrightDate>
</serie>
<host>
<editor>
<json:item>
<name>Yuji Matsumoto</name>
<affiliations>
<json:string>Graduate School of Information Science, Nara Institute of Science and Technology, 630-0192, Takayama, Ikoma, Nara, Japan</json:string>
<json:string>E-mail: matsu@is.naist.jp</json:string>
</affiliations>
</json:item>
<json:item>
<name>Richard W. Sproat</name>
<affiliations>
<json:string>Dept of ECE, University of Illinois at Urbana Champaign, IL 61801, Urbana, USA</json:string>
<json:string>E-mail: rws@xoba.com</json:string>
</affiliations>
</json:item>
<json:item>
<name>Kam-Fai Wong</name>
<affiliations>
<json:string>Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong</json:string>
<json:string>E-mail: kfwong@se.cuhk.edu.hk</json:string>
</affiliations>
</json:item>
<json:item>
<name>Min Zhang</name>
<affiliations>
<json:string>State Key Lab of Intelligent Tech. &amp; Sys., Tsinghua University,</json:string>
<json:string>E-mail: miz14@pitt.edu</json:string>
</affiliations>
</json:item>
</editor>
<subject>
<json:item>
<value>Computer Science</value>
</json:item>
<json:item>
<value>Computer Science</value>
</json:item>
<json:item>
<value>Artificial Intelligence (incl. Robotics)</value>
</json:item>
<json:item>
<value>Mathematical Logic and Formal Languages</value>
</json:item>
<json:item>
<value>Language Translation and Linguistics</value>
</json:item>
<json:item>
<value>Data Mining and Knowledge Discovery</value>
</json:item>
<json:item>
<value>Algorithm Analysis and Problem Complexity</value>
</json:item>
<json:item>
<value>Document Preparation and Text Processing</value>
</json:item>
</subject>
<isbn>
<json:string>978-3-540-49667-0</json:string>
</isbn>
<language>
<json:string>unknown</json:string>
</language>
<eissn>
<json:string>1611-3349</json:string>
</eissn>
<title>Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead</title>
<genre.original>
<json:string>Proceedings</json:string>
</genre.original>
<bookId>
<json:string>978-3-540-49668-7</json:string>
</bookId>
<volume>4285</volume>
<pages>
<last>21</last>
<first>13</first>
</pages>
<issn>
<json:string>0302-9743</json:string>
</issn>
<genre>
<json:string>Book Series</json:string>
</genre>
<eisbn>
<json:string>978-3-540-49668-7</json:string>
</eisbn>
<copyrightDate>2006</copyrightDate>
<doi>
<json:string>10.1007/11940098</json:string>
</doi>
</host>
<publicationDate>2006</publicationDate>
<copyrightDate>2006</copyrightDate>
<doi>
<json:string>10.1007/11940098_2</json:string>
</doi>
<id>DD409C0DADEED45D7BE31E2E877D68EFE748A53D</id>
<fulltext>
<json:item>
<original>true</original>
<mimetype>application/pdf</mimetype>
<extension>pdf</extension>
<uri>https://api.istex.fr/document/DD409C0DADEED45D7BE31E2E877D68EFE748A53D/fulltext/pdf</uri>
</json:item>
<json:item>
<original>false</original>
<mimetype>application/zip</mimetype>
<extension>zip</extension>
<uri>https://api.istex.fr/document/DD409C0DADEED45D7BE31E2E877D68EFE748A53D/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/DD409C0DADEED45D7BE31E2E877D68EFE748A53D/fulltext/tei">
<teiHeader>
<fileDesc>
<titleStmt>
<title level="a" type="main" xml:lang="en">Segmentation of Mixed Chinese/English Document Including Scattered Italic Characters</title>
<respStmt xml:id="ISTEX-API" resp="Références bibliographiques récupérées via GROBID" name="ISTEX-API (INIST-CNRS)"></respStmt>
</titleStmt>
<publicationStmt>
<authority>ISTEX</authority>
<publisher>Springer Berlin Heidelberg</publisher>
<pubPlace>Berlin, Heidelberg</pubPlace>
<availability>
<p>SPRINGER</p>
</availability>
<date>2006</date>
</publicationStmt>
<sourceDesc>
<biblStruct type="inbook">
<analytic>
<title level="a" type="main" xml:lang="en">Segmentation of Mixed Chinese/English Document Including Scattered Italic Characters</title>
<author>
<persName>
<forename type="first">Yong</forename>
<surname>Xia</surname>
</persName>
<email>yong.xia@ia.ac.cn</email>
<affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</affiliation>
</author>
<author>
<persName>
<forename type="first">Chun-Heng</forename>
<surname>Wang</surname>
</persName>
<email>chunheng.wang@ia.ac.cn</email>
<affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</affiliation>
</author>
<author>
<persName>
<forename type="first">Ru-Wei</forename>
<surname>Dai</surname>
</persName>
<email>ruwei.dai@ia.ac.cn</email>
<affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</affiliation>
</author>
</analytic>
<monogr>
<title level="m">Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead</title>
<title level="m" type="sub">21st International Conference, ICCPOL 2006, Singapore, December 17-19, 2006. Proceedings</title>
<idno type="pISBN">978-3-540-49667-0</idno>
<idno type="eISBN">978-3-540-49668-7</idno>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="DOI">10.1007/11940098</idno>
<idno type="BookID">978-3-540-49668-7</idno>
<idno type="BookTitleID">142982</idno>
<idno type="BookSequenceNumber">4285</idno>
<idno type="BookVolumeNumber">4285</idno>
<idno type="BookChapterCount">56</idno>
<editor>
<persName>
<forename type="first">Yuji</forename>
<surname>Matsumoto</surname>
</persName>
<email>matsu@is.naist.jp</email>
<affiliation>Graduate School of Information Science, Nara Institute of Science and Technology, 630-0192, Takayama, Ikoma, Nara, Japan</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Richard</forename>
<forename type="first">W.</forename>
<surname>Sproat</surname>
</persName>
<email>rws@xoba.com</email>
<affiliation>Dept of ECE, University of Illinois at Urbana Champaign, IL 61801, Urbana, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Kam-Fai</forename>
<surname>Wong</surname>
</persName>
<email>kfwong@se.cuhk.edu.hk</email>
<affiliation>Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Min</forename>
<surname>Zhang</surname>
</persName>
<email>miz14@pitt.edu</email>
<affiliation>State Key Lab of Intelligent Tech. &amp; Sys., Tsinghua University,</affiliation>
</editor>
<imprint>
<publisher>Springer Berlin Heidelberg</publisher>
<pubPlace>Berlin, Heidelberg</pubPlace>
<date type="published" when="2006"></date>
<biblScope unit="volume">4285</biblScope>
<biblScope unit="page" from="13">13</biblScope>
<biblScope unit="page" to="21">21</biblScope>
</imprint>
</monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<editor>
<persName>
<forename type="first">David</forename>
<surname>Hutchison</surname>
</persName>
<affiliation>Lancaster University, UK</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Takeo</forename>
<surname>Kanade</surname>
</persName>
<affiliation>Carnegie Mellon University, Pittsburgh, PA, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Josef</forename>
<surname>Kittler</surname>
</persName>
<affiliation>University of Surrey, Guildford, UK</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Jon</forename>
<forename type="first">M.</forename>
<surname>Kleinberg</surname>
</persName>
<affiliation>Cornell University, Ithaca, NY, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Friedemann</forename>
<surname>Mattern</surname>
</persName>
<affiliation>ETH Zurich, Switzerland</affiliation>
</editor>
<editor>
<persName>
<forename type="first">John</forename>
<forename type="first">C.</forename>
<surname>Mitchell</surname>
</persName>
<affiliation>Stanford University, CA, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Moni</forename>
<surname>Naor</surname>
</persName>
<affiliation>Weizmann Institute of Science, Rehovot, Israel</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Oscar</forename>
<surname>Nierstrasz</surname>
</persName>
<affiliation>University of Bern, Switzerland</affiliation>
</editor>
<editor>
<persName>
<forename type="first">C.</forename>
<surname>Pandu Rangan</surname>
</persName>
<affiliation>Indian Institute of Technology, Madras, India</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Bernhard</forename>
<surname>Steffen</surname>
</persName>
<affiliation>University of Dortmund, Germany</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Madhu</forename>
<surname>Sudan</surname>
</persName>
<affiliation>Massachusetts Institute of Technology, MA, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Demetri</forename>
<surname>Terzopoulos</surname>
</persName>
<affiliation>University of California, Los Angeles, CA, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Doug</forename>
<surname>Tygar</surname>
</persName>
<affiliation>University of California, Berkeley, CA, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Moshe</forename>
<forename type="first">Y.</forename>
<surname>Vardi</surname>
</persName>
<affiliation>Rice University, Houston, TX, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Gerhard</forename>
<surname>Weikum</surname>
</persName>
<affiliation>Max-Planck Institute of Computer Science, Saarbruecken, Germany</affiliation>
</editor>
<biblScope>
<date>2006</date>
</biblScope>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="seriesId">558</idno>
</series>
<series>
<title level="s">Lecture Notes in Artificial Intelligence</title>
<editor>
<persName>
<forename type="first">David</forename>
<surname>Hutchison</surname>
</persName>
<affiliation>Lancaster University, UK</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Takeo</forename>
<surname>Kanade</surname>
</persName>
<affiliation>Carnegie Mellon University, Pittsburgh, PA, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Josef</forename>
<surname>Kittler</surname>
</persName>
<affiliation>University of Surrey, Guildford, UK</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Jon</forename>
<forename type="first">M.</forename>
<surname>Kleinberg</surname>
</persName>
<affiliation>Cornell University, Ithaca, NY, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Friedemann</forename>
<surname>Mattern</surname>
</persName>
<affiliation>ETH Zurich, Switzerland</affiliation>
</editor>
<editor>
<persName>
<forename type="first">John</forename>
<forename type="first">C.</forename>
<surname>Mitchell</surname>
</persName>
<affiliation>Stanford University, CA, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Moni</forename>
<surname>Naor</surname>
</persName>
<affiliation>Weizmann Institute of Science, Rehovot, Israel</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Oscar</forename>
<surname>Nierstrasz</surname>
</persName>
<affiliation>University of Bern, Switzerland</affiliation>
</editor>
<editor>
<persName>
<forename type="first">C.</forename>
<surname>Pandu Rangan</surname>
</persName>
<affiliation>Indian Institute of Technology, Madras, India</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Bernhard</forename>
<surname>Steffen</surname>
</persName>
<affiliation>University of Dortmund, Germany</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Madhu</forename>
<surname>Sudan</surname>
</persName>
<affiliation>Massachusetts Institute of Technology, MA, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Demetri</forename>
<surname>Terzopoulos</surname>
</persName>
<affiliation>University of California, Los Angeles, CA, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Doug</forename>
<surname>Tygar</surname>
</persName>
<affiliation>University of California, Berkeley, CA, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Moshe</forename>
<forename type="first">Y.</forename>
<surname>Vardi</surname>
</persName>
<affiliation>Rice University, Houston, TX, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Gerhard</forename>
<surname>Weikum</surname>
</persName>
<affiliation>Max-Planck Institute of Computer Science, Saarbruecken, Germany</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Jaime</forename>
<forename type="first">G.</forename>
<surname>Carbonell</surname>
</persName>
<affiliation>Carnegie Mellon University, Pittsburgh, PA, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Jörg</forename>
<surname>Siekmann</surname>
</persName>
<affiliation>University of Saarland, Saarbrücken, Germany</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Yuji</forename>
<surname>Matsumoto</surname>
</persName>
<email>matsu@is.naist.jp</email>
<affiliation>Graduate School of Information Science, Nara Institute of Science and Technology, 630-0192, Takayama, Ikoma, Nara, Japan</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Richard</forename>
<forename type="first">W.</forename>
<surname>Sproat</surname>
</persName>
<email>rws@xoba.com</email>
<affiliation>Dept of ECE, University of Illinois at Urbana Champaign, IL 61801, Urbana, USA</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Kam-Fai</forename>
<surname>Wong</surname>
</persName>
<email>kfwong@se.cuhk.edu.hk</email>
<affiliation>Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong</affiliation>
</editor>
<editor>
<persName>
<forename type="first">Min</forename>
<surname>Zhang</surname>
</persName>
<email>miz14@pitt.edu</email>
<affiliation>State Key Lab of Intelligent Tech. &amp; Sys., Tsinghua University,</affiliation>
</editor>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<biblScope type="seriesId">1244</biblScope>
</series>
<idno type="istex">DD409C0DADEED45D7BE31E2E877D68EFE748A53D</idno>
<idno type="DOI">10.1007/11940098_2</idno>
<idno type="ChapterID">2</idno>
<idno type="ChapterID">Chap2</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<creation>
<date>2006</date>
</creation>
<langUsage>
<language ident="en">en</language>
</langUsage>
<abstract xml:lang="en">
<p>Abstract: It is difficult to segment mixed Chinese/English documents when there are many italic characters scattered in documents. Most contributions attach more attention to English documents. However, mixed document is different from English document and some special features should be considered. This paper gives a new way to solve the problem. At first, an appropriate character area is chosen to detect italic. Next, a two-step strategy is adopted. Italic determination is done first and then if the character pattern is identified as italic, the estimation of slant angle will be done. Finally the italic character pattern is corrected by shear transform. A method of adopting two-step weighted projection profile histogram for italic determination is introduced. And a fast algorithm to estimate slant angle is also introduced. Three large sample collections, including character and character-pair and document respectively, are provided to evaluate our method and encouraging results are achieved.</p>
</abstract>
<textClass>
<keywords scheme="Book Subject Collection">
<list>
<label>SUCO11645</label>
<item>
<term>Computer Science</term>
</item>
</list>
</keywords>
</textClass>
<textClass>
<keywords scheme="Book Subject Group">
<list>
<label>I</label>
<label>I21017</label>
<label>I16048</label>
<label>I21041</label>
<label>I18030</label>
<label>I16021</label>
<label>I21033</label>
<item>
<term>Computer Science</term>
</item>
<item>
<term>Artificial Intelligence (incl. Robotics)</term>
</item>
<item>
<term>Mathematical Logic and Formal Languages</term>
</item>
<item>
<term>Language Translation and Linguistics</term>
</item>
<item>
<term>Data Mining and Knowledge Discovery</term>
</item>
<item>
<term>Algorithm Analysis and Problem Complexity</term>
</item>
<item>
<term>Document Preparation and Text Processing</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc>
<change when="2006">Published</change>
<change xml:id="refBibs-istex" who="#ISTEX-API" when="2016-3-20">References added</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item>
<original>false</original>
<mimetype>text/plain</mimetype>
<extension>txt</extension>
<uri>https://api.istex.fr/document/DD409C0DADEED45D7BE31E2E877D68EFE748A53D/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata>
<istex:metadataXml wicri:clean="Springer, Publisher found" wicri:toSee="no header">
<istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//Springer-Verlag//DTD A++ V2.4//EN" URI="http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd" name="istex:docType"></istex:docType>
<istex:document>
<Publisher>
<PublisherInfo>
<PublisherName>Springer Berlin Heidelberg</PublisherName>
<PublisherLocation>Berlin, Heidelberg</PublisherLocation>
</PublisherInfo>
<Series>
<SeriesInfo SeriesType="Series" TocLevels="0">
<SeriesID>558</SeriesID>
<SeriesPrintISSN>0302-9743</SeriesPrintISSN>
<SeriesElectronicISSN>1611-3349</SeriesElectronicISSN>
<SeriesTitle Language="En">Lecture Notes in Computer Science</SeriesTitle>
</SeriesInfo>
<SeriesHeader>
<EditorGroup>
<Editor AffiliationIDS="Aff1">
<EditorName DisplayOrder="Western">
<GivenName>David</GivenName>
<FamilyName>Hutchison</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff2">
<EditorName DisplayOrder="Western">
<GivenName>Takeo</GivenName>
<FamilyName>Kanade</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff3">
<EditorName DisplayOrder="Western">
<GivenName>Josef</GivenName>
<FamilyName>Kittler</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff4">
<EditorName DisplayOrder="Western">
<GivenName>Jon</GivenName>
<GivenName>M.</GivenName>
<FamilyName>Kleinberg</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff5">
<EditorName DisplayOrder="Western">
<GivenName>Friedemann</GivenName>
<FamilyName>Mattern</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff6">
<EditorName DisplayOrder="Western">
<GivenName>John</GivenName>
<GivenName>C.</GivenName>
<FamilyName>Mitchell</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff7">
<EditorName DisplayOrder="Western">
<GivenName>Moni</GivenName>
<FamilyName>Naor</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff8">
<EditorName DisplayOrder="Western">
<GivenName>Oscar</GivenName>
<FamilyName>Nierstrasz</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff9">
<EditorName DisplayOrder="Western">
<GivenName>C.</GivenName>
<FamilyName>Pandu Rangan</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff10">
<EditorName DisplayOrder="Western">
<GivenName>Bernhard</GivenName>
<FamilyName>Steffen</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff11">
<EditorName DisplayOrder="Western">
<GivenName>Madhu</GivenName>
<FamilyName>Sudan</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff12">
<EditorName DisplayOrder="Western">
<GivenName>Demetri</GivenName>
<FamilyName>Terzopoulos</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff13">
<EditorName DisplayOrder="Western">
<GivenName>Doug</GivenName>
<FamilyName>Tygar</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff14">
<EditorName DisplayOrder="Western">
<GivenName>Moshe</GivenName>
<GivenName>Y.</GivenName>
<FamilyName>Vardi</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff15">
<EditorName DisplayOrder="Western">
<GivenName>Gerhard</GivenName>
<FamilyName>Weikum</FamilyName>
</EditorName>
</Editor>
<Affiliation ID="Aff1">
<OrgName>Lancaster University</OrgName>
<OrgAddress>
<Country>UK</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff2">
<OrgName>Carnegie Mellon University</OrgName>
<OrgAddress>
<City>Pittsburgh</City>
<State>PA</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff3">
<OrgName>University of Surrey</OrgName>
<OrgAddress>
<City>Guildford</City>
<Country>UK</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff4">
<OrgName>Cornell University</OrgName>
<OrgAddress>
<City>Ithaca</City>
<State>NY</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff5">
<OrgName>ETH Zurich</OrgName>
<OrgAddress>
<Country>Switzerland</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff6">
<OrgName>Stanford University</OrgName>
<OrgAddress>
<City>CA</City>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff7">
<OrgName>Weizmann Institute of Science</OrgName>
<OrgAddress>
<City>Rehovot</City>
<Country>Israel</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff8">
<OrgName>University of Bern</OrgName>
<OrgAddress>
<Country>Switzerland</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff9">
<OrgName>Indian Institute of Technology</OrgName>
<OrgAddress>
<City>Madras</City>
<Country>India</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff10">
<OrgName>University of Dortmund</OrgName>
<OrgAddress>
<Country>Germany</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff11">
<OrgName>Massachusetts Institute of Technology</OrgName>
<OrgAddress>
<City>MA</City>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff12">
<OrgName>University of California</OrgName>
<OrgAddress>
<City>Los Angeles</City>
<State>CA</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff13">
<OrgName>University of California</OrgName>
<OrgAddress>
<City>Berkeley</City>
<State>CA</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff14">
<OrgName>Rice University</OrgName>
<OrgAddress>
<City>Houston</City>
<State>TX</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff15">
<OrgName>Max-Planck Institute of Computer Science</OrgName>
<OrgAddress>
<City>Saarbruecken</City>
<Country>Germany</Country>
</OrgAddress>
</Affiliation>
</EditorGroup>
</SeriesHeader>
<SubSeries>
<SubSeriesInfo>
<SubSeriesID>1244</SubSeriesID>
<SubSeriesPrintISSN>0302-9743</SubSeriesPrintISSN>
<SubSeriesElectronicISSN>1611-3349</SubSeriesElectronicISSN>
<SubSeriesTitle Language="En">Lecture Notes in Artificial Intelligence</SubSeriesTitle>
</SubSeriesInfo>
<SubSeriesHeader>
<EditorGroup>
<Editor AffiliationIDS="Aff16">
<EditorName DisplayOrder="Western">
<GivenName>Jaime</GivenName>
<GivenName>G.</GivenName>
<FamilyName>Carbonell</FamilyName>
</EditorName>
</Editor>
<Editor AffiliationIDS="Aff17">
<EditorName DisplayOrder="Western">
<GivenName>Jörg</GivenName>
<FamilyName>Siekmann</FamilyName>
</EditorName>
</Editor>
<Affiliation ID="Aff16">
<OrgName>Carnegie Mellon University</OrgName>
<OrgAddress>
<City>Pittsburgh</City>
<State>PA</State>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff17">
<OrgName>University of Saarland</OrgName>
<OrgAddress>
<City>Saarbrücken</City>
<Country>Germany</Country>
</OrgAddress>
</Affiliation>
</EditorGroup>
</SubSeriesHeader>
</SubSeries>
<Book Language="En">
<BookInfo BookProductType="Proceedings" ContainsESM="No" Language="En" MediaType="eBook" NumberingDepth="2" NumberingStyle="ContentOnly" OutputMedium="All" TocLevels="0">
<BookID>978-3-540-49668-7</BookID>
<BookTitle>Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead</BookTitle>
<BookSubTitle>21st International Conference, ICCPOL 2006, Singapore, December 17-19, 2006. Proceedings</BookSubTitle>
<BookVolumeNumber>4285</BookVolumeNumber>
<BookSequenceNumber>4285</BookSequenceNumber>
<BookDOI>10.1007/11940098</BookDOI>
<BookTitleID>142982</BookTitleID>
<BookPrintISBN>978-3-540-49667-0</BookPrintISBN>
<BookElectronicISBN>978-3-540-49668-7</BookElectronicISBN>
<BookChapterCount>56</BookChapterCount>
<BookCopyright>
<CopyrightHolderName>Springer-Verlag Berlin Heidelberg</CopyrightHolderName>
<CopyrightYear>2006</CopyrightYear>
</BookCopyright>
<BookSubjectGroup>
<BookSubject Code="I" Type="Primary">Computer Science</BookSubject>
<BookSubject Code="I21017" Priority="1" Type="Secondary">Artificial Intelligence (incl. Robotics)</BookSubject>
<BookSubject Code="I16048" Priority="2" Type="Secondary">Mathematical Logic and Formal Languages</BookSubject>
<BookSubject Code="I21041" Priority="3" Type="Secondary">Language Translation and Linguistics</BookSubject>
<BookSubject Code="I18030" Priority="4" Type="Secondary">Data Mining and Knowledge Discovery</BookSubject>
<BookSubject Code="I16021" Priority="5" Type="Secondary">Algorithm Analysis and Problem Complexity</BookSubject>
<BookSubject Code="I21033" Priority="6" Type="Secondary">Document Preparation and Text Processing</BookSubject>
<SubjectCollection Code="SUCO11645">Computer Science</SubjectCollection>
</BookSubjectGroup>
<BookContext>
<SeriesID>558</SeriesID>
<SubSeriesID>1244</SubSeriesID>
</BookContext>
</BookInfo>
<BookHeader>
<EditorGroup>
<Editor AffiliationIDS="Aff18">
<EditorName DisplayOrder="Western">
<GivenName>Yuji</GivenName>
<FamilyName>Matsumoto</FamilyName>
</EditorName>
<Contact>
<Email>matsu@is.naist.jp</Email>
</Contact>
</Editor>
<Editor AffiliationIDS="Aff19">
<EditorName DisplayOrder="Western">
<GivenName>Richard</GivenName>
<GivenName>W.</GivenName>
<FamilyName>Sproat</FamilyName>
</EditorName>
<Contact>
<Email>rws@xoba.com</Email>
</Contact>
</Editor>
<Editor AffiliationIDS="Aff20">
<EditorName DisplayOrder="Western">
<GivenName>Kam-Fai</GivenName>
<FamilyName>Wong</FamilyName>
</EditorName>
<Contact>
<Email>kfwong@se.cuhk.edu.hk</Email>
</Contact>
</Editor>
<Editor AffiliationIDS="Aff21">
<EditorName DisplayOrder="Western">
<GivenName>Min</GivenName>
<FamilyName>Zhang</FamilyName>
</EditorName>
<Contact>
<Email>miz14@pitt.edu</Email>
</Contact>
</Editor>
<Affiliation ID="Aff18">
<OrgDivision>Graduate School of Information Science</OrgDivision>
<OrgName>Nara Institute of Science and Technology</OrgName>
<OrgAddress>
<Postcode>630-0192</Postcode>
<City>Takayama, Ikoma, Nara</City>
<Country>Japan</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff19">
<OrgDivision>Dept of ECE</OrgDivision>
<OrgName>University of Illinois at Urbana Champaign</OrgName>
<OrgAddress>
<Postcode>IL 61801</Postcode>
<City>Urbana</City>
<Country>USA</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff20">
<OrgDivision>Department of Systems Engineering and Engineering Management</OrgDivision>
<OrgName>The Chinese University of Hong Kong</OrgName>
<OrgAddress>
<City>Shatin, N.T., Hong Kong</City>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff21">
<OrgDivision>State Key Lab of Intelligent Tech. &amp; Sys.</OrgDivision>
<OrgName>Tsinghua University</OrgName>
<OrgAddress>
<Country> </Country>
</OrgAddress>
</Affiliation>
</EditorGroup>
</BookHeader>
<Part ID="Part1">
<PartInfo TocLevels="0">
<PartID>1</PartID>
<PartSequenceNumber>1</PartSequenceNumber>
<PartTitle>Information Retrieval/Document Classification/QA/Summarization I</PartTitle>
<PartChapterCount>5</PartChapterCount>
<PartContext>
<SeriesID>558</SeriesID>
<BookTitle>Computer Processing of Oriental Languages</BookTitle>
</PartContext>
</PartInfo>
<Chapter ID="Chap2" Language="En">
<ChapterInfo ChapterType="OriginalPaper" ContainsESM="No" NumberingDepth="2" NumberingStyle="ContentOnly" TocLevels="0">
<ChapterID>2</ChapterID>
<ChapterDOI>10.1007/11940098_2</ChapterDOI>
<ChapterSequenceNumber>2</ChapterSequenceNumber>
<ChapterTitle Language="En">Segmentation of Mixed Chinese/English Document Including Scattered Italic Characters</ChapterTitle>
<ChapterFirstPage>13</ChapterFirstPage>
<ChapterLastPage>21</ChapterLastPage>
<ChapterCopyright>
<CopyrightHolderName>Springer-Verlag Berlin Heidelberg</CopyrightHolderName>
<CopyrightYear>2006</CopyrightYear>
</ChapterCopyright>
<ChapterGrants Type="Regular">
<MetadataGrant Grant="OpenAccess"></MetadataGrant>
<AbstractGrant Grant="OpenAccess"></AbstractGrant>
<BodyPDFGrant Grant="Restricted"></BodyPDFGrant>
<BodyHTMLGrant Grant="Restricted"></BodyHTMLGrant>
<BibliographyGrant Grant="Restricted"></BibliographyGrant>
<ESMGrant Grant="Restricted"></ESMGrant>
</ChapterGrants>
<ChapterContext>
<SeriesID>558</SeriesID>
<PartID>1</PartID>
<BookID>978-3-540-49668-7</BookID>
<BookTitle>Computer Processing of Oriental Languages</BookTitle>
</ChapterContext>
</ChapterInfo>
<ChapterHeader>
<AuthorGroup>
<Author AffiliationIDS="Aff22">
<AuthorName DisplayOrder="Western">
<GivenName>Yong</GivenName>
<FamilyName>Xia</FamilyName>
</AuthorName>
<Contact>
<Email>yong.xia@ia.ac.cn</Email>
</Contact>
</Author>
<Author AffiliationIDS="Aff22">
<AuthorName DisplayOrder="Western">
<GivenName>Chun-Heng</GivenName>
<FamilyName>Wang</FamilyName>
</AuthorName>
<Contact>
<Email>chunheng.wang@ia.ac.cn</Email>
</Contact>
</Author>
<Author AffiliationIDS="Aff22">
<AuthorName DisplayOrder="Western">
<GivenName>Ru-Wei</GivenName>
<FamilyName>Dai</FamilyName>
</AuthorName>
<Contact>
<Email>ruwei.dai@ia.ac.cn</Email>
</Contact>
</Author>
<Affiliation ID="Aff22">
<OrgDivision>Laboratory of Complex System and Intelligence Science, Institute of Automation</OrgDivision>
<OrgName>Chinese Academy of Sciences</OrgName>
<OrgAddress>
<City>Beijing</City>
<Postcode>100080</Postcode>
<Country>China</Country>
</OrgAddress>
</Affiliation>
</AuthorGroup>
<Abstract ID="Abs1" Language="En">
<Heading>Abstract</Heading>
<Para>It is difficult to segment mixed Chinese/English documents when there are many italic characters scattered in documents. Most contributions attach more attention to English documents. However, mixed document is different from English document and some special features should be considered. This paper gives a new way to solve the problem. At first, an appropriate character area is chosen to detect italic. Next, a two-step strategy is adopted. Italic determination is done first and then if the character pattern is identified as italic, the estimation of slant angle will be done. Finally the italic character pattern is corrected by shear transform. A method of adopting two-step weighted projection profile histogram for italic determination is introduced. And a fast algorithm to estimate slant angle is also introduced. Three large sample collections, including character and character-pair and document respectively, are provided to evaluate our method and encouraging results are achieved.</Para>
</Abstract>
</ChapterHeader>
<NoBody></NoBody>
</Chapter>
</Part>
</Book>
</Series>
</Publisher>
</istex:document>
</istex:metadataXml>
<mods version="3.6">
<titleInfo lang="en">
<title>Segmentation of Mixed Chinese/English Document Including Scattered Italic Characters</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA" lang="en">
<title>Segmentation of Mixed Chinese/English Document Including Scattered Italic Characters</title>
</titleInfo>
<name type="personal">
<namePart type="given">Yong</namePart>
<namePart type="family">Xia</namePart>
<affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</affiliation>
<affiliation>E-mail: yong.xia@ia.ac.cn</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Chun-Heng</namePart>
<namePart type="family">Wang</namePart>
<affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</affiliation>
<affiliation>E-mail: chunheng.wang@ia.ac.cn</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ru-Wei</namePart>
<namePart type="family">Dai</namePart>
<affiliation>Laboratory of Complex System and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, 100080, Beijing, China</affiliation>
<affiliation>E-mail: ruwei.dai@ia.ac.cn</affiliation>
<role>
<roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="conference [eBooks]" displayLabel="OriginalPaper"></genre>
<originInfo>
<publisher>Springer Berlin Heidelberg</publisher>
<place>
<placeTerm type="text">Berlin, Heidelberg</placeTerm>
</place>
<dateIssued encoding="w3cdtf">2006</dateIssued>
<copyrightDate encoding="w3cdtf">2006</copyrightDate>
</originInfo>
<language>
<languageTerm type="code" authority="rfc3066">en</languageTerm>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
</language>
<physicalDescription>
<internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">Abstract: It is difficult to segment mixed Chinese/English documents when there are many italic characters scattered in documents. Most contributions attach more attention to English documents. However, mixed document is different from English document and some special features should be considered. This paper gives a new way to solve the problem. At first, an appropriate character area is chosen to detect italic. Next, a two-step strategy is adopted. Italic determination is done first and then if the character pattern is identified as italic, the estimation of slant angle will be done. Finally the italic character pattern is corrected by shear transform. A method of adopting two-step weighted projection profile histogram for italic determination is introduced. And a fast algorithm to estimate slant angle is also introduced. Three large sample collections, including character and character-pair and document respectively, are provided to evaluate our method and encouraging results are achieved.</abstract>
<relatedItem type="host">
<titleInfo>
<title>Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead</title>
<subTitle>21st International Conference, ICCPOL 2006, Singapore, December 17-19, 2006. Proceedings</subTitle>
</titleInfo>
<name type="personal">
<namePart type="given">Yuji</namePart>
<namePart type="family">Matsumoto</namePart>
<affiliation>Graduate School of Information Science, Nara Institute of Science and Technology, 630-0192, Takayama, Ikoma, Nara, Japan</affiliation>
<affiliation>E-mail: matsu@is.naist.jp</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Richard</namePart>
<namePart type="given">W.</namePart>
<namePart type="family">Sproat</namePart>
<affiliation>Dept of ECE, University of Illinois at Urbana Champaign, IL 61801, Urbana, USA</affiliation>
<affiliation>E-mail: rws@xoba.com</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Kam-Fai</namePart>
<namePart type="family">Wong</namePart>
<affiliation>Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong</affiliation>
<affiliation>E-mail: kfwong@se.cuhk.edu.hk</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Min</namePart>
<namePart type="family">Zhang</namePart>
<affiliation>State Key Lab of Intelligent Tech. &amp; Sys., Tsinghua University</affiliation>
<affiliation>E-mail: miz14@pitt.edu</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<genre type="Book Series" displayLabel="Proceedings"></genre>
<originInfo>
<copyrightDate encoding="w3cdtf">2006</copyrightDate>
<issuance>monographic</issuance>
</originInfo>
<subject>
<genre>Book Subject Collection</genre>
<topic authority="SpringerSubjectCodes" authorityURI="SUCO11645">Computer Science</topic>
</subject>
<subject>
<genre>Book Subject Group</genre>
<topic authority="SpringerSubjectCodes" authorityURI="I">Computer Science</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I21017">Artificial Intelligence (incl. Robotics)</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I16048">Mathematical Logic and Formal Languages</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I21041">Language Translation and Linguistics</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I18030">Data Mining and Knowledge Discovery</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I16021">Algorithm Analysis and Problem Complexity</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I21033">Document Preparation and Text Processing</topic>
</subject>
<identifier type="DOI">10.1007/11940098</identifier>
<identifier type="ISBN">978-3-540-49667-0</identifier>
<identifier type="eISBN">978-3-540-49668-7</identifier>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="BookTitleID">142982</identifier>
<identifier type="BookID">978-3-540-49668-7</identifier>
<identifier type="BookChapterCount">56</identifier>
<identifier type="BookVolumeNumber">4285</identifier>
<identifier type="BookSequenceNumber">4285</identifier>
<identifier type="PartChapterCount">5</identifier>
<part>
<date>2006</date>
<detail type="part">
<title>Information Retrieval/Document Classification/QA/Summarization I</title>
</detail>
<detail type="volume">
<number>4285</number>
<caption>vol.</caption>
</detail>
<extent unit="pages">
<start>13</start>
<end>21</end>
</extent>
</part>
<recordInfo>
<recordOrigin>Springer-Verlag Berlin Heidelberg, 2006</recordOrigin>
</recordInfo>
</relatedItem>
<relatedItem type="series">
<titleInfo>
<title>Lecture Notes in Computer Science</title>
</titleInfo>
<name type="personal">
<namePart type="given">David</namePart>
<namePart type="family">Hutchison</namePart>
<affiliation>Lancaster University, UK</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Takeo</namePart>
<namePart type="family">Kanade</namePart>
<affiliation>Carnegie Mellon University, Pittsburgh, PA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Josef</namePart>
<namePart type="family">Kittler</namePart>
<affiliation>University of Surrey, Guildford, UK</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jon</namePart>
<namePart type="given">M.</namePart>
<namePart type="family">Kleinberg</namePart>
<affiliation>Cornell University, Ithaca, NY, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Friedemann</namePart>
<namePart type="family">Mattern</namePart>
<affiliation>ETH Zurich, Switzerland</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">John</namePart>
<namePart type="given">C.</namePart>
<namePart type="family">Mitchell</namePart>
<affiliation>Stanford University, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Moni</namePart>
<namePart type="family">Naor</namePart>
<affiliation>Weizmann Institute of Science, Rehovot, Israel</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Oscar</namePart>
<namePart type="family">Nierstrasz</namePart>
<affiliation>University of Bern, Switzerland</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">C.</namePart>
<namePart type="family">Pandu Rangan</namePart>
<affiliation>Indian Institute of Technology, Madras, India</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Bernhard</namePart>
<namePart type="family">Steffen</namePart>
<affiliation>University of Dortmund, Germany</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Madhu</namePart>
<namePart type="family">Sudan</namePart>
<affiliation>Massachusetts Institute of Technology, MA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Demetri</namePart>
<namePart type="family">Terzopoulos</namePart>
<affiliation>University of California, Los Angeles, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Doug</namePart>
<namePart type="family">Tygar</namePart>
<affiliation>University of California, Berkeley, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Moshe</namePart>
<namePart type="given">Y.</namePart>
<namePart type="family">Vardi</namePart>
<affiliation>Rice University, Houston, TX, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gerhard</namePart>
<namePart type="family">Weikum</namePart>
<affiliation>Max-Planck Institute of Computer Science, Saarbruecken, Germany</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<copyrightDate encoding="w3cdtf">2006</copyrightDate>
<issuance>serial</issuance>
</originInfo>
<relatedItem type="constituent">
<titleInfo>
<title>Lecture Notes in Artificial Intelligence</title>
</titleInfo>
<name type="personal">
<namePart type="given">David</namePart>
<namePart type="family">Hutchison</namePart>
<affiliation>Lancaster University, UK</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Takeo</namePart>
<namePart type="family">Kanade</namePart>
<affiliation>Carnegie Mellon University, Pittsburgh, PA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Josef</namePart>
<namePart type="family">Kittler</namePart>
<affiliation>University of Surrey, Guildford, UK</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jon</namePart>
<namePart type="given">M.</namePart>
<namePart type="family">Kleinberg</namePart>
<affiliation>Cornell University, Ithaca, NY, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Friedemann</namePart>
<namePart type="family">Mattern</namePart>
<affiliation>ETH Zurich, Switzerland</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">John</namePart>
<namePart type="given">C.</namePart>
<namePart type="family">Mitchell</namePart>
<affiliation>Stanford University, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Moni</namePart>
<namePart type="family">Naor</namePart>
<affiliation>Weizmann Institute of Science, Rehovot, Israel</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Oscar</namePart>
<namePart type="family">Nierstrasz</namePart>
<affiliation>University of Bern, Switzerland</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">C.</namePart>
<namePart type="family">Pandu Rangan</namePart>
<affiliation>Indian Institute of Technology, Madras, India</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Bernhard</namePart>
<namePart type="family">Steffen</namePart>
<affiliation>University of Dortmund, Germany</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Madhu</namePart>
<namePart type="family">Sudan</namePart>
<affiliation>Massachusetts Institute of Technology, MA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Demetri</namePart>
<namePart type="family">Terzopoulos</namePart>
<affiliation>University of California, Los Angeles, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Doug</namePart>
<namePart type="family">Tygar</namePart>
<affiliation>University of California, Berkeley, CA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Moshe</namePart>
<namePart type="given">Y.</namePart>
<namePart type="family">Vardi</namePart>
<affiliation>Rice University, Houston, TX, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gerhard</namePart>
<namePart type="family">Weikum</namePart>
<affiliation>Max-Planck Institute of Computer Science, Saarbruecken, Germany</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jaime</namePart>
<namePart type="given">G.</namePart>
<namePart type="family">Carbonell</namePart>
<affiliation>Carnegie Mellon University, Pittsburgh, PA, USA</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jörg</namePart>
<namePart type="family">Siekmann</namePart>
<affiliation>University of Saarland, Saarbrücken, Germany</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yuji</namePart>
<namePart type="family">Matsumoto</namePart>
<affiliation>Graduate School of Information Science, Nara Institute of Science and Technology, 630-0192, Takayama, Ikoma, Nara, Japan</affiliation>
<affiliation>E-mail: matsu@is.naist.jp</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Richard</namePart>
<namePart type="given">W.</namePart>
<namePart type="family">Sproat</namePart>
<affiliation>Dept of ECE, University of Illinois at Urbana Champaign, IL 61801, Urbana, USA</affiliation>
<affiliation>E-mail: rws@xoba.com</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Kam-Fai</namePart>
<namePart type="family">Wong</namePart>
<affiliation>Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong</affiliation>
<affiliation>E-mail: kfwong@se.cuhk.edu.hk</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Min</namePart>
<namePart type="family">Zhang</namePart>
<affiliation>State Key Lab of Intelligent Tech. &amp; Sys., Tsinghua University</affiliation>
<affiliation>E-mail: miz14@pitt.edu</affiliation>
<role>
<roleTerm type="text">editor</roleTerm>
</role>
</name>
<genre type="Sub-Series"></genre>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="SubSeriesID">1244</identifier>
</relatedItem>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="SeriesID">558</identifier>
<recordInfo>
<recordOrigin>Springer-Verlag Berlin Heidelberg, 2006</recordOrigin>
</recordInfo>
</relatedItem>
<identifier type="istex">DD409C0DADEED45D7BE31E2E877D68EFE748A53D</identifier>
<identifier type="DOI">10.1007/11940098_2</identifier>
<identifier type="ChapterID">2</identifier>
<identifier type="ChapterID">Chap2</identifier>
<accessCondition type="use and reproduction" contentType="copyright">Springer-Verlag Berlin Heidelberg, 2006</accessCondition>
<recordInfo>
<recordContentSource>SPRINGER</recordContentSource>
<recordOrigin>Springer-Verlag Berlin Heidelberg, 2006</recordOrigin>
</recordInfo>
</mods>
</metadata>
<enrichments>
<istex:refBibTEI uri="https://api.istex.fr/document/DD409C0DADEED45D7BE31E2E877D68EFE748A53D/enrichments/refBib">
<teiHeader></teiHeader>
<text>
<front></front>
<body></body>
<back>
<listBibl>
<biblStruct xml:id="b0">
<analytic>
<title level="a" type="main">Application of Slant Correction to Handwritten Japanese Address Recognition</title>
<author>
<persName>
<forename type="first">Y</forename>
<forename type="middle">M</forename>
<surname>Ding</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">M</forename>
<surname>Okada</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">F</forename>
<surname>Kimura</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">Y</forename>
<surname>Miyake</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of the Sixth International Conference on Document Analysis and Recognition</title>
<meeting>the Sixth International Conference on Document Analysis and Recognition</meeting>
<imprint>
<date type="published" when="2001"></date>
<biblScope unit="page" from="670" to="674"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b1">
<analytic>
<title level="a" type="main">Slant estimation for handwritten words by directionally refined chain code</title>
<author>
<persName>
<forename type="first">Y</forename>
<forename type="middle">M</forename>
<surname>Ding</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">F</forename>
<surname>Kimura</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">Y</forename>
<surname>Miyake</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">M</forename>
<surname>Shridhar</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of the Seventh International Workshop on Frontiers in Handwriting Recognition</title>
<meeting>the Seventh International Workshop on Frontiers in Handwriting Recognition</meeting>
<imprint>
<date type="published" when="2000"></date>
<biblScope unit="page" from="53" to="62"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b2">
<analytic>
<title level="a" type="main">Local slant estimation for handwritten English words</title>
<author>
<persName>
<forename type="first">Y</forename>
<forename type="middle">M</forename>
<surname>Ding</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">W</forename>
<surname>Ohyama</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">F</forename>
<surname>Kimura</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">M</forename>
<surname>Shridhar</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of the Ninth International Workshop on Frontiers in Handwriting Recognition</title>
<meeting>the Ninth International Workshop on Frontiers in Handwriting Recognition
<address>
<addrLine>Kokubunji, Tokyo, Japan</addrLine>
</address>
</meeting>
<imprint>
<date type="published" when="2004"></date>
<biblScope unit="page" from="328" to="333"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b3">
<analytic>
<title level="a" type="main">A system for reading USA census '90 hand-written fields</title>
<author>
<persName>
<forename type="first">L</forename>
<surname>Simoncini</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">Zs</forename>
<forename type="middle">M</forename>
<surname>Kovacs-V</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of the Third International Conference on Document Analysis and Recognition</title>
<meeting>the Third International Conference on Document Analysis and Recognition
<address>
<addrLine>Montreal</addrLine>
</address>
</meeting>
<imprint>
<date type="published" when="1995"></date>
<biblScope unit="page" from="86" to="91"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b4">
<analytic>
<title level="a" type="main">Generalised projections: a tool for cursive character normalization</title>
<author>
<persName>
<forename type="first">G</forename>
<surname>Nicchiotti</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">C</forename>
<surname>Scagliola</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of Fifth International Conference on Document Analysis and Recognition</title>
<meeting>Fifth International Conference on Document Analysis and Recognition
<address>
<addrLine>Bangalore</addrLine>
</address>
</meeting>
<imprint>
<date type="published" when="1999"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b5">
<analytic>
<title level="a" type="main">Italic Detection and Rectification</title>
<author>
<persName>
<forename type="first">K</forename>
<forename type="middle">C</forename>
<surname>Fan</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">C</forename>
<forename type="middle">H</forename>
<surname>Huang</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">T</forename>
<forename type="middle">C</forename>
<surname>Chuang</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of 2005 International Conference on Image Processing</title>
<meeting>2005 International Conference on Image Processing</meeting>
<imprint>
<date type="published" when="2005"></date>
<biblScope unit="page" from="530" to="533"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b6">
<analytic>
<title level="a" type="main">A segmentation method for touching italic characters</title>
<author>
<persName>
<forename type="first">Y</forename>
<surname>Li</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">S</forename>
<surname>Naoi</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">M</forename>
<surname>Cheriet</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">C</forename>
<forename type="middle">Y</forename>
<surname>Suen</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of Seventeenth International Conference on Pattern Recognition</title>
<meeting>Seventeenth International Conference on Pattern Recognition</meeting>
<imprint>
<date type="published" when="2004"></date>
<biblScope unit="page" from="594" to="597"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b7">
<analytic>
<title level="a" type="main">Restoration and segmentation of machine printed documents</title>
<author>
<persName>
<forename type="first">L</forename>
<surname>Su</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Canada</title>
<imprint>
<biblScope unit="page" from="92" to="95"></biblScope>
<date type="published" when="1996"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b8">
<analytic>
<title level="a" type="main">Skew and slant correction for document images using gradient direction</title>
<author>
<persName>
<forename type="first">C</forename>
<forename type="middle">M</forename>
<surname>Sun</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">D</forename>
<surname>Si</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of the Fourth International Conference on Document Analysis and Recognition</title>
<meeting>the Fourth International Conference on Document Analysis and Recognition</meeting>
<imprint>
<date type="published" when="1997"></date>
<biblScope unit="page" from="142" to="146"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b9">
<analytic>
<title level="a" type="main">Slant estimation of handwritten characters by means of Zernike moments</title>
<author>
<persName>
<forename type="first">J</forename>
<surname>Ballesteros</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">C</forename>
<forename type="middle">M</forename>
<surname>Travieso</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">J</forename>
<forename type="middle">B</forename>
<surname>Alonso</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">M</forename>
<forename type="middle">A</forename>
<surname>Ferrer</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Electronics Letters</title>
<imprint>
<biblScope unit="volume">41</biblScope>
<biblScope unit="issue">20</biblScope>
<biblScope unit="page" from="1110"></biblScope>
<date type="published" when="2005"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b10">
<analytic>
<title level="a" type="main">Automatic detection of italic, bold and all-capital words in document images</title>
<author>
<persName>
<forename type="first">B</forename>
<forename type="middle">B</forename>
<surname>Chaudhuri</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">U</forename>
<surname>Garain</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="m">Proceedings of Fourteenth International Conference on Pattern Recognition</title>
<meeting>Fourteenth International Conference on Pattern Recognition</meeting>
<imprint>
<date type="published" when="1998"></date>
<biblScope unit="page" from="610" to="612"></biblScope>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b11">
<analytic>
<title level="a" type="main">Slant estimation algorithm for OCR system</title>
<author>
<persName>
<forename type="first">E</forename>
<surname>Kavallieratou</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">N</forename>
<surname>Fakotakis</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">G</forename>
<surname>Kokkinakis</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Pattern Recognition</title>
<imprint>
<biblScope unit="volume">34</biblScope>
<biblScope unit="issue">12</biblScope>
<biblScope unit="page" from="2515" to="2522"></biblScope>
<date type="published" when="2001"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b12">
<analytic>
<title level="a" type="main">Off-line cursive script word recognition</title>
<author>
<persName>
<forename type="first">R</forename>
<forename type="middle">M</forename>
<surname>Bozinovic</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">S</forename>
<forename type="middle">N</forename>
<surname>Srihari</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">IEEE Transactions on Pattern Analysis and Machine Intelligence</title>
<imprint>
<biblScope unit="volume">11</biblScope>
<biblScope unit="issue">1</biblScope>
<biblScope unit="page" from="68" to="83"></biblScope>
<date type="published" when="1989"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b13">
<analytic>
<title level="a" type="main">Segmentation of mixed Chinese/English document based on AFMPF model</title>
<author>
<persName>
<forename type="first">Y</forename>
<surname>Xia</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">C</forename>
<forename type="middle">H</forename>
<surname>Wang</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">R</forename>
<forename type="middle">W</forename>
<surname>Dai</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Acta Automatica Sinica</title>
<imprint>
<biblScope unit="volume">32</biblScope>
<biblScope unit="issue">3</biblScope>
<biblScope unit="page" from="353" to="359"></biblScope>
<date type="published" when="2006"></date>
</imprint>
</monogr>
</biblStruct>
<biblStruct xml:id="b14">
<analytic>
<title level="a" type="main">Segmentation of mixed Chinese/English documents based on Chinese Radicals recognition and complexity analysis in local segment pattern</title>
<author>
<persName>
<forename type="first">Y</forename>
<surname>Xia</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">B</forename>
<forename type="middle">H</forename>
<surname>Xiao</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">C</forename>
<forename type="middle">H</forename>
<surname>Wang</surname>
</persName>
</author>
<author>
<persName>
<forename type="first">Y</forename>
<forename type="middle">D</forename>
<surname>Li</surname>
</persName>
</author>
</analytic>
<monogr>
<title level="j">Lecture Notes in Control and Information Sciences</title>
<imprint>
<publisher>Springer-Verlag</publisher>
<biblScope unit="volume">345</biblScope>
<biblScope unit="page" from="497" to="506"></biblScope>
<date type="published" when="2006"></date>
</imprint>
</monogr>
</biblStruct>
</listBibl>
</back>
</text>
</istex:refBibTEI>
</enrichments>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Istex/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002815 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 002815 | SxmlIndent | more
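
Once a record has been dumped as XML, its reference list can be inspected with a general-purpose parser. The following is a minimal sketch and not part of Dilib: `SAMPLE` and `list_references` are hypothetical names, and the fragment is a trimmed `biblStruct` in the same shape as this record's reference list. It assumes the input has no undeclared namespace prefixes; the full record uses `wicri:` and `istex:` prefixes, which would likely need to be declared or stripped before parsing.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the reserved "xml:" prefix always maps to.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# Hypothetical sample: one trimmed biblStruct entry shaped like the record's references.
SAMPLE = """<listBibl>
<biblStruct xml:id="b11">
<analytic>
<title level="a" type="main">Slant estimation algorithm for OCR system</title>
<author><persName><forename type="first">E</forename><surname>Kavallieratou</surname></persName></author>
</analytic>
<monogr>
<title level="j">Pattern Recognition</title>
<imprint><date type="published" when="2001"></date></imprint>
</monogr>
</biblStruct>
</listBibl>"""


def list_references(xml_text):
    """Return (id, first-author surname, article title, year) per biblStruct."""
    root = ET.fromstring(xml_text)
    rows = []
    for bibl in root.iter("biblStruct"):
        ref_id = bibl.get(XML_NS + "id")
        # findtext returns the text of the first matching element, or the default.
        title = bibl.findtext("analytic/title", default="")
        surname = bibl.findtext("analytic/author/persName/surname", default="")
        date = bibl.find("monogr/imprint/date")
        year = date.get("when") if date is not None else None
        rows.append((ref_id, surname, title, year))
    return rows


if __name__ == "__main__":
    for row in list_references(SAMPLE):
        print(row)
```

In practice the XML text would come from the output of the `HfdSelect` pipeline above rather than from an embedded string. Entries with a missing surname or date yield an empty string or `None` instead of raising an error.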

To link to this page from within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:DD409C0DADEED45D7BE31E2E877D68EFE748A53D
   |texte=   Segmentation of Mixed Chinese/English Document Including Scattered Italic Characters
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024