Classifier Adaptation with Non-representative Training Data
Internal identifier: 001927 (Main/Curation); previous: 001926; next: 001928
Authors: Sriharsha Veeramachaneni [United States]; George Nagy (computer scientist) [United States]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2002.
Abstract
We propose an adaptive methodology to tune the decision boundaries of a classifier trained on non-representative data to the statistics of the test data to improve accuracy. Specifically, for machine-printed and hand-printed digit recognition we demonstrate that adapting the class means alone can provide considerable gains in recognition. On machine-printed digits we adapt to the typeface, on hand-printed digits to the writer. We recognize the digits with a Gaussian quadratic classifier when the style of the test set is represented by a subset of the training set, and also when it is not represented in the training set. We compare unsupervised adaptation and style-constrained classification on isogenous test sets of five machine-printed and two hand-printed NIST data sets. Both estimating the mean and imposing style constraints reduce the error rate in almost every case, and neither ever results in significant loss. They are comparable under the first scenario (specialization), but adaptation is better under the second (new style). Adaptation is beneficial when the test set is large enough (even if only ten samples of each class by one writer in a 100-dimensional feature space), but style-conscious classification is the only option with fields of only two or three digits.
Url:
DOI: 10.1007/3-540-45869-7_17
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000B14
- to stream Istex, to step Curation: 000B01
- to stream Istex, to step Checkpoint: 001041
- to stream Main, to step Merge: 001A07
Links to Exploration step
ISTEX:BA6AC24A377F2F9A6379DAC3467543B5C8B7A845
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Classifier Adaptation with Non-representative Training Data</title>
<author><name sortKey="Veeramachaneni, Sriharsha" sort="Veeramachaneni, Sriharsha" uniqKey="Veeramachaneni S" first="Sriharsha" last="Veeramachaneni">Sriharsha Veeramachaneni</name>
</author>
<author><name sortKey="Nagy, George" sort="Nagy, George" uniqKey="Nagy G" first="George" last="Nagy">George Nagy (informaticien)</name>
<affiliation><country>États-Unis</country>
<placeName><settlement type="city">Troy (New York)</settlement>
<region type="state">État de New York</region>
</placeName>
<orgName type="lab" n="5">Institut polytechnique Rensselaer</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:BA6AC24A377F2F9A6379DAC3467543B5C8B7A845</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45869-7_17</idno>
<idno type="url">https://api.istex.fr/document/BA6AC24A377F2F9A6379DAC3467543B5C8B7A845/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000B14</idno>
<idno type="wicri:Area/Istex/Curation">000B01</idno>
<idno type="wicri:Area/Istex/Checkpoint">001041</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Veeramachaneni S:classifier:adaptation:with</idno>
<idno type="wicri:Area/Main/Merge">001A07</idno>
<idno type="wicri:Area/Main/Curation">001927</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Classifier Adaptation with Non-representative Training Data</title>
<author><name sortKey="Veeramachaneni, Sriharsha" sort="Veeramachaneni, Sriharsha" uniqKey="Veeramachaneni S" first="Sriharsha" last="Veeramachaneni">Sriharsha Veeramachaneni</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Rensselaer Polytechnic Institute, 12180, Troy, NY</wicri:regionArea>
<placeName><region type="state">État de New York</region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Nagy, George" sort="Nagy, George" uniqKey="Nagy G" first="George" last="Nagy">George Nagy (informaticien)</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Rensselaer Polytechnic Institute, 12180, Troy, NY</wicri:regionArea>
<placeName><region type="state">État de New York</region>
</placeName>
<placeName><settlement type="city">Troy (New York)</settlement>
<region type="state">État de New York</region>
</placeName>
<orgName type="lab" n="5">Institut polytechnique Rensselaer</orgName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">BA6AC24A377F2F9A6379DAC3467543B5C8B7A845</idno>
<idno type="DOI">10.1007/3-540-45869-7_17</idno>
<idno type="ChapterID">17</idno>
<idno type="ChapterID">Chap17</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: We propose an adaptive methodology to tune the decision boundaries of a classifier trained on non-representative data to the statistics of the test data to improve accuracy. Specifically, for machine-printed and hand-printed digit recognition we demonstrate that adapting the class means alone can provide considerable gains in recognition. On machine-printed digits we adapt to the typeface, on hand-printed digits to the writer. We recognize the digits with a Gaussian quadratic classifier when the style of the test set is represented by a subset of the training set, and also when it is not represented in the training set. We compare unsupervised adaptation and style-constrained classification on isogenous test sets of five machine-printed and two hand-printed NIST data sets. Both estimating the mean and imposing style constraints reduce the error rate in almost every case, and neither ever results in significant loss. They are comparable under the first scenario (specialization), but adaptation is better under the second (new style). Adaptation is beneficial when the test set is large enough (even if only ten samples of each class by one writer in a 100-dimensional feature space), but style-conscious classification is the only option with fields of only two or three digits.</div>
</front>
</TEI>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001927 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Curation/biblio.hfd -nk 001927 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Curation |type= RBID |clé= ISTEX:BA6AC24A377F2F9A6379DAC3467543B5C8B7A845 |texte= Classifier Adaptation with Non-representative Training Data }}
This area was generated with Dilib version V0.6.32.