GS4: Generating Synthetic Samples for Semi-Supervised Nearest Neighbor Classification

Authors :: Panagiotis Moutafis
Ioannis A. Kakadiaris
Source :: Lecture Notes in Computer Science ISBN: 9783319131856, PAKDD Workshops
Publication Year :: 2014
Publisher :: Springer International Publishing, 2014.
Abstract: In this paper, we propose a method to improve nearest neighbor classification accuracy under a semi-supervised setting. We call our approach GS4 (i.e., Generating Synthetic Samples Semi-Supervised). Existing self-training approaches classify unlabeled samples by exploiting local information. These samples are then incorporated into the training set of labeled data. However, errors are propagated and misclassifications at an early stage severely degrade the classification accuracy. To address this problem, the proposed method exploits the unlabeled data by using weights proportional to the classification confidence to generate synthetic samples. Specifically, our scheme is inspired by the Synthetic Minority Over-Sampling Technique. That is, each unlabeled sample is used to generate as many labeled samples as the number of classes represented by its \(k\)-nearest neighbors. In particular, the distance of each synthetic sample from its \(k\)-nearest neighbors of the same class is proportional to the classification confidence. As a result, the robustness to misclassification errors is increased and better accuracy is achieved. Experimental results using publicly available datasets demonstrate that statistically significant improvements are obtained when the proposed approach is employed.
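The generation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gs4_sketch`, the use of the fraction of same-class neighbors as the classification confidence, and the interpolation from the centroid of those neighbors toward the unlabeled sample are all assumptions made here for illustration. The sketch only mirrors two statements from the abstract: each unlabeled sample yields one synthetic sample per class found among its \(k\)-nearest neighbors, and the distance of that synthetic sample from its same-class neighbors grows with the confidence.

```python
import numpy as np

def gs4_sketch(X_lab, y_lab, X_unl, k=3):
    """Hypothetical GS4-style generator (illustrative, not the paper's code).

    For every unlabeled sample, emit one labeled synthetic sample per class
    present among its k nearest labeled neighbors.
    """
    X_lab = np.asarray(X_lab, dtype=float)
    y_lab = np.asarray(y_lab)
    synth_X, synth_y = [], []
    for u in np.asarray(X_unl, dtype=float):
        # Euclidean distances from the unlabeled sample to all labeled samples.
        dists = np.linalg.norm(X_lab - u, axis=1)
        nn = np.argsort(dists, kind="stable")[:k]
        for c in np.unique(y_lab[nn]):
            members = nn[y_lab[nn] == c]
            # Assumption: confidence = share of the neighbors voting for class c.
            conf = len(members) / len(nn)
            # Assumption: SMOTE-like interpolation starting at the centroid of
            # the class-c neighbors and moving toward the unlabeled sample.
            anchor = X_lab[members].mean(axis=0)
            synth_X.append(anchor + conf * (u - anchor))
            synth_y.append(c)
    return np.array(synth_X), np.array(synth_y)

# Toy data: two labeled samples of class 0, one of class 1, one unlabeled sample.
X_new, y_new = gs4_sketch([[0, 0], [1, 0], [10, 0]], [0, 0, 1], [[2, 0]], k=3)
```

Under these assumptions, a low-confidence class produces a synthetic sample that stays close to the labeled neighbors of that class, so a wrong guess perturbs the training set less than assigning a hard label to the unlabeled sample itself.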

Subjects :: ComputingMethodologies_PATTERNRECOGNITION
Training set
Robustness (computer science)
business.industry
Computer science
Labeled data
Sample (statistics)
Pattern recognition
Artificial intelligence
Semi-supervised learning
business
Class (biology)
k-nearest neighbors algorithm

Details

ISBN :: 978-3-319-13185-6
ISBNs :: 9783319131856
Database :: OpenAIRE
Journal :: Lecture Notes in Computer Science ISBN: 9783319131856, PAKDD Workshops
Accession number :: edsair.doi...........0553b20a3705b6783a8636bcbafd71fa
Full Text :: https://doi.org/10.1007/978-3-319-13186-3_36
