Bias Correction in a Small Sample from Big Data.

Authors :
Lu, Jianguo
Li, Dingding
Source :
IEEE Transactions on Knowledge & Data Engineering. Nov 2013, Vol. 25, Issue 11, p. 2658-2663. 6 p.
Publication Year :
2013

Abstract

This paper discusses the bias problem when estimating the population size of big data such as online social networks (OSN) using uniform random sampling and simple random walk. Unlike the traditional estimation problem where the sample size is not very small relative to the data size, in big data, a small sample relative to the data size is already very large and costly to obtain. We point out that when small samples are used, there is a bias that is no longer negligible. This paper shows analytically that the relative bias can be approximated by the reciprocal of the number of collisions; thereby, a bias correction estimator is introduced. The result is further supported by both simulation studies and the real Twitter network that contains 41.7 million nodes. [ABSTRACT FROM AUTHOR]
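
The abstract's central claim, that the relative bias is roughly the reciprocal of the number of collisions, can be illustrated with a minimal Python sketch. It is not taken from the paper. It assumes uniform random sampling with replacement and the familiar birthday-paradox estimator N ≈ n(n-1)/(2C), where n is the sample size and C is the number of colliding pairs. It also assumes the bias is upward, so that dividing by (1 + 1/C) corrects it. The function names and the exact form of the correction are illustrative assumptions; the paper's own estimator may differ in detail, and the simple random walk case would need degree weighting that is not shown here.

```python
from collections import Counter


def count_collisions(sample):
    """Count unordered pairs of draws that landed on the same item."""
    return sum(k * (k - 1) // 2 for k in Counter(sample).values())


def naive_estimate(sample):
    """Birthday-paradox style size estimate: n(n-1) / (2C). Assumed form."""
    n = len(sample)
    c = count_collisions(sample)
    if c == 0:
        raise ValueError("no collisions observed; the estimate is undefined")
    return n * (n - 1) / (2 * c)


def corrected_estimate(sample):
    """Divide out an assumed relative bias of about 1/C (hypothetical form)."""
    c = count_collisions(sample)
    return naive_estimate(sample) / (1 + 1 / c)


# Ten draws of node IDs; IDs 1 and 2 each appear twice, giving C = 2.
sample = [1, 2, 3, 1, 4, 5, 2, 6, 7, 8]
print(count_collisions(sample), naive_estimate(sample), corrected_estimate(sample))
```

With only two collisions the assumed correction factor is 1.5, which shows why the bias cannot be ignored when C is small; as C grows, the factor approaches 1 and the two estimates converge.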

Details

Language :
English
ISSN :
1041-4347
Volume :
25
Issue :
11
Database :
Academic Search Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
90678200
Full Text :
https://doi.org/10.1109/TKDE.2012.220