Efficient Detection of Soft Concatenation Mapping.
- Source :
- IEEE Transactions on Knowledge & Data Engineering. 11/1/2018, Vol. 30 Issue 11, p2106-2119. 14p.
- Publication Year :
- 2018
Abstract
- In modern big data warehouse systems, we observe a common phenomenon that a column of data values can be derived from one or several other columns by transforming and concatenating these columns. We call this relationship between columns a Soft Concatenation Mapping (SCM). SCMs imply significant redundancy in the schema or data, and therefore can be exploited for data integration or data compression. In this paper, we formalize the problem of SCM detection and prove it is NP-hard. We then propose efficient approximate algorithms to detect all SCMs or an optimal set of SCMs in a table. Our experiments on both real-world and synthetic datasets show promising results. [ABSTRACT FROM AUTHOR]
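As an illustration of the relationship the abstract describes, the following minimal sketch checks how often one column equals the concatenation of transformed values from other columns. It is not the paper's detection algorithm: the table, the column names (`first`, `last`, `login`), the two transformations, and the `match_rate` helper are all hypothetical, and reporting a fraction of matching rows rather than a yes/no answer is an assumption of this sketch.

```python
from typing import Callable, Sequence

Row = dict[str, str]
Transform = Callable[[str], str]


def match_rate(
    rows: Sequence[Row],
    target: str,
    parts: Sequence[tuple[str, Transform]],
) -> float:
    """Return the fraction of rows whose target value equals the
    concatenation of the transformed source columns, in order."""
    if not rows:
        return 0.0
    hits = 0
    for row in rows:
        derived = "".join(transform(row[col]) for col, transform in parts)
        if derived == row[target]:
            hits += 1
    return hits / len(rows)


def first_initial(value: str) -> str:
    """Hypothetical transformation: lowercase first character."""
    return value[:1].lower()


def lowercase(value: str) -> str:
    """Hypothetical transformation: lowercase the whole value."""
    return value.lower()


# Hypothetical table: `login` is usually first initial + last name.
rows = [
    {"first": "Ada", "last": "Lovelace", "login": "alovelace"},
    {"first": "Alan", "last": "Turing", "login": "aturing"},
    {"first": "Grace", "last": "Hopper", "login": "ghopper"},
    {"first": "Edsger", "last": "Dijkstra", "login": "edsger.d"},
]

rate = match_rate(rows, "login", [("first", first_initial), ("last", lowercase)])
print(rate)
```

Here the candidate mapping is supplied by hand. Searching over all column combinations and transformations is the hard part that the paper addresses, and this sketch does not attempt it.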
- Subjects :
- *DATA warehousing
- *DATA mining
- *BIG data
- *DATA compression
- *EMAIL
- *DATA integration
Details
- Language :
- English
- ISSN :
- 10414347
- Volume :
- 30
- Issue :
- 11
- Database :
- Academic Search Index
- Journal :
- IEEE Transactions on Knowledge & Data Engineering
- Publication Type :
- Academic Journal
- Accession number :
- 132209517
- Full Text :
- https://doi.org/10.1109/TKDE.2018.2812822