Efficient discovery of similarity constraints for matching dependencies.

Authors :
Song, Shaoxu
Chen, Lei
Source :
Data & Knowledge Engineering. Sep 2013, Vol. 87, pp. 146-166. 21 pp.
Publication Year :
2013

Abstract

The concept of matching dependencies (MDs) has recently been proposed for specifying matching rules for object identification. Like functional dependencies (with conditions), MDs can also be applied to various data quality applications, such as detecting violations of integrity constraints. In this paper, we study the problem of discovering similarity constraints for matching dependencies from a given database instance. First, we introduce two measures, support and confidence, for evaluating the utility of MDs in the given data. Then, we study the discovery of MDs that meet given utility requirements of support and confidence. Exact algorithms are developed, together with pruning strategies to improve time performance. Since the exact algorithm has to traverse all the data during the computation, we propose an approximate solution that uses only part of the data. A bound on the relative errors introduced by the approximation is also developed. Finally, our experimental evaluation demonstrates the efficiency of the proposed methods. [Copyright Elsevier]

Details

Language :
English
ISSN :
0169-023X
Volume :
87
Database :
Academic Search Index
Journal :
Data & Knowledge Engineering
Publication Type :
Academic Journal
Accession number :
90522520
Full Text :
https://doi.org/10.1016/j.datak.2013.06.003