
SEMI-SUPERVISED LEARNING: EXPLOITING UNLABELED DATA WITH SYMMETRICAL DISTRIBUTION AND HIGH CONFIDENCE.

Authors :
YIHAO ZHANG
JUNHAO WEN
FANGFANG TANG
ZHUO JIANG
Source :
International Journal of Pattern Recognition & Artificial Intelligence; Nov2012, Vol. 26 Issue 7, p1-18, 18p, 1 Diagram, 8 Charts, 4 Graphs
Publication Year :
2012

Abstract

Existing representative approaches to semi-supervised incremental learning prefer to select unlabeled instances predicted with high confidence for model retraining. However, this strategy may degrade classification performance rather than improve it, because relying on high confidence for data selection can lead to an erroneous estimate of the true distribution, especially when the confidence annotator is highly correlated with the classifier. In this paper, a new semi-supervised incremental learning algorithm is proposed that selects high-confidence unlabeled instances with a symmetrical distribution from the unlabeled data, which can reduce the bias in the estimate to some degree. Specifically, the expectation-maximization algorithm is used to estimate the confidence of each instance, a Gaussian function is used to calculate the data distribution, and the selected unlabeled data are then used to retrain the model with a classifier algorithm. Experimental results on a large number of UCI data sets show that the algorithm can effectively exploit unlabeled data to enhance learning performance. [ABSTRACT FROM AUTHOR]
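
The pipeline outlined in the abstract (EM-based confidence, a Gaussian model of the data, symmetric selection, retraining) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define "symmetrical distribution", so the sketch assumes one possible reading, namely keeping equal numbers of high-confidence instances on each side of the estimated class mean. The one-dimensional two-class setting, the function names (m_step, e_step, select_symmetric), the 0.9 threshold and the toy data are all hypothetical choices made for illustration.

```python
import math

MIN_VAR = 1e-6  # variance floor to keep the Gaussian density well defined

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def m_step(labeled, unlabeled, resp):
    """Re-estimate (prior, mean, var) per class.

    Labeled points count with weight 1 for their own class; unlabeled
    points count with their current posterior (soft) weight.
    Assumes every class has at least one labeled point.
    """
    total = len(labeled) + len(unlabeled)
    params = {}
    for c in (0, 1):
        pts = [(x, 1.0) for x, y in labeled if y == c]
        pts += [(x, r[c]) for x, r in zip(unlabeled, resp)]
        w = sum(wt for _, wt in pts)
        mean = sum(x * wt for x, wt in pts) / w
        var = max(sum(wt * (x - mean) ** 2 for x, wt in pts) / w, MIN_VAR)
        params[c] = (w / total, mean, var)
    return params

def e_step(unlabeled, params):
    """Posterior class probabilities, used here as the confidence score."""
    resp = []
    for x in unlabeled:
        joint = [params[c][0] * gaussian_pdf(x, params[c][1], params[c][2])
                 for c in (0, 1)]
        z = sum(joint)
        resp.append([j / z for j in joint] if z > 0 else [0.5, 0.5])
    return resp

def select_symmetric(unlabeled, resp, params, threshold=0.9):
    """Hypothetical symmetric selection (an assumption, not the paper's rule).

    Per class, keep high-confidence points in equal numbers on both sides
    of the class mean, nearest to the mean first, so the selected sample
    does not pull the estimated mean to one side.
    """
    selected = []
    for c in (0, 1):
        mean = params[c][1]
        confident = [x for x, r in zip(unlabeled, resp) if r[c] >= threshold]
        left = sorted((x for x in confident if x < mean), key=lambda x: mean - x)
        right = sorted((x for x in confident if x >= mean), key=lambda x: x - mean)
        k = min(len(left), len(right))
        selected += [(x, c) for x in left[:k] + right[:k]]
    return selected

# Toy data (hypothetical): a few labeled points and a pool of unlabeled ones.
labeled = [(-2.1, 0), (-1.9, 0), (1.9, 1), (2.1, 1)]
unlabeled = [-3.0, -2.5, -2.2, -1.0, 0.1, 1.2, 2.4, 2.6, 3.1]

params = m_step(labeled, [], [])       # initialise from labeled data only
for _ in range(20):                    # fixed number of EM iterations
    resp = e_step(unlabeled, params)
    params = m_step(labeled, unlabeled, resp)
resp = e_step(unlabeled, params)       # final confidences

chosen = select_symmetric(unlabeled, resp, params)
augmented = labeled + chosen           # training set for the retraining step
```

In this sketch, augmented would be passed to whatever classifier is being retrained. Selecting by confidence alone would keep every point above the threshold; the balancing step discards the surplus on the heavier side of each class mean, which is one simple way to limit the selection bias the abstract warns about.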

Details

Language :
English
ISSN :
0218-0014
Volume :
26
Issue :
7
Database :
Complementary Index
Journal :
International Journal of Pattern Recognition & Artificial Intelligence
Publication Type :
Academic Journal
Accession number :
85990286
Full Text :
https://doi.org/10.1142/S0218001412510032