TODQA: Efficient Task-Oriented Data Quality Assessment

Authors :
Xie Yunting
Xiang Xiao
Anran Li
Xiang-Yang Li
Jianwei Qian
Lan Zhang
Source :
MSN
Publication Year :
2019
Publisher :
IEEE, 2019.

Abstract

Data quality assessment is vital for many information services ranging from sensor networks to smart city systems. Current data quality assessments, however, are often derived from intrinsic data characteristics, disconnected from specific application contexts, or are not applicable or efficient for large datasets. In this work, we propose a novel task-oriented data quality assessment framework that balances intrinsic and contextual quality. We carefully craft the assessment metrics, quantify them, and fuse them to rank candidate datasets by quality for a given task. To improve system efficiency, two fast calculation algorithms are designed to quantify the relationship between datasets and the task, and the distribution of data items. We conduct extensive evaluations on six public image datasets (460,247 images in total) and four text document datasets (37,372 documents in total) to evaluate the efficacy and efficiency of our design. Experimental results show that our algorithms can save about 90% of computing time with little accuracy loss, which validates the feasibility and effectiveness of our framework for large datasets.

Details

Database :
OpenAIRE
Journal :
2019 15th International Conference on Mobile Ad-Hoc and Sensor Networks (MSN)
Accession number :
edsair.doi...........4ee33034cfca2e2689778a086b1b6a6f