Using demographics toward efficient data classification in citizen science: a Bayesian approach
- Source :
- PeerJ Computer Science, Vol 5, p e239 (2019)
- Publication Year :
- 2019
- Publisher :
- PeerJ Inc., 2019.
Abstract
- Public participation in scientific activities, often called citizen science, makes it possible to collect and analyze an unprecedentedly large amount of data. However, the diversity of volunteers makes it challenging to obtain accurate information when these data are aggregated. To overcome this problem, we propose a classification algorithm using Bayesian inference that harnesses the diversity of volunteers to improve data accuracy. In the algorithm, each volunteer is grouped into a distinct class based on a survey regarding either their level of education or their motivation to participate in citizen science. We obtained the behavior of each class through a training set, which was then used as prior information to estimate the performance of new volunteers. By applying this approach to an existing citizen science dataset in which images are classified into categories, we demonstrate an improvement in data accuracy compared to traditional majority voting. Our algorithm offers a simple yet powerful way to improve data accuracy under limited volunteer effort by predicting the behavior of a class of individuals, rather than attempting a granular description of each of them.
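
The idea summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the single per-class accuracy parameter, the Laplace smoothing constant `alpha`, the uniform prior over labels, and the assumption that errors are spread evenly over the wrong labels are all simplifications introduced here for illustration. The sketch estimates how reliable each volunteer class is from a labeled training set, then weights new votes by that class-level reliability instead of counting them equally.

```python
import math
from collections import Counter


def estimate_class_accuracy(training, alpha=1.0):
    """Estimate P(correct) per volunteer class from a training set.

    training: iterable of (volunteer_class, given_label, true_label).
    alpha: hypothetical Laplace smoothing constant (an assumption here).
    """
    correct, total = Counter(), Counter()
    for cls, given, truth in training:
        total[cls] += 1
        correct[cls] += int(given == truth)
    return {c: (correct[c] + alpha) / (total[c] + 2 * alpha) for c in total}


def bayes_classify(votes, accuracy, labels):
    """Return the label with the highest posterior, assuming a uniform prior.

    votes: list of (volunteer_class, given_label) for one item.
    Each vote is treated as correct with its class accuracy p; otherwise the
    error is assumed to fall uniformly on the remaining labels.
    """
    log_post = {lab: 0.0 for lab in labels}
    for cls, given in votes:
        p = accuracy[cls]
        for lab in labels:
            likelihood = p if given == lab else (1 - p) / (len(labels) - 1)
            log_post[lab] += math.log(likelihood)
    return max(log_post, key=log_post.get)


def majority_vote(votes):
    """Baseline: the most frequently given label, ignoring volunteer class."""
    return Counter(given for _, given in votes).most_common(1)[0][0]
```

With a training set in which one class is right far more often than another, a single vote from the reliable class can outweigh two votes from the less reliable class, whereas majority voting would side with the two. The class labels here stand in for the survey-based groups (education level or motivation) described in the abstract.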
- Subjects :
- Majority rule
General Computer Science
Computer science
Data classification
Bayesian probability
Citizen science
050905 science studies
Machine learning
computer.software_genre
Bayesian inference
lcsh:QA75.5-76.95
Scientific Computing and Simulation
Bayes estimator
business.industry
05 social sciences
Bayesian estimation
Class (biology)
Algorithm
Algorithms and Analysis of Algorithms
Public participation
Artificial intelligence
lcsh:Electronic computers. Computer science
0509 other social sciences
050904 information & library sciences
business
computer
Algorithms
- Language :
- English
- ISSN :
- 23765992
- Volume :
- 5
- Database :
- OpenAIRE
- Journal :
- PeerJ Computer Science
- Accession number :
- edsair.doi.dedup.....7708ce7dde5cddad7cf10d0721c16dc2