1. Learning efficiently over heterogeneous databases
- Author
Sudhanshu Pathak, Arash Termehchy, and Jose Picado
- Subjects
Matching (statistics), Database, Relation (database), Relational database, Process (engineering), Computer science, Similarity (psychology), General Engineering, Statistical relational learning, Datalog
- Abstract
Given a relational database and training examples for a target relation, relational learning algorithms learn a Datalog program that defines the target relation in terms of the existing relations in the database. We demonstrate CastorX, a relational learning system that performs relational learning over heterogeneous databases. The user specifies matching attributes between (heterogeneous) databases through matching dependencies. Because the content in these attributes may not match exactly, CastorX uses similarity operators to find matching values in these attributes. As the learning process may become expensive, CastorX implements sampling techniques that allow it to learn efficiently and output accurate definitions.
- Published
2018
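
To make the abstract's idea of matching attributes across databases with a similarity operator concrete, here is a minimal sketch. It is not CastorX's implementation: the table contents, the function names `similar` and `similarity_join`, the 0.8 threshold, and the `cap` parameter are all hypothetical, and Python's standard `difflib.SequenceMatcher` stands in for whatever similarity operator the system actually uses. The `cap` parameter loosely illustrates keeping only a sample of candidate matches to bound cost; the paper's own sampling techniques may differ.

```python
import random
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical similarity operator: True when two strings are close enough."""
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= threshold


def similarity_join(left, right, threshold=0.8, cap=5, seed=0):
    """Pair rows whose matching attributes are similar rather than equal.

    `left` and `right` are lists of (id, value) tuples drawn from two
    different databases. For each left row, at most `cap` matches are
    kept, chosen by random sampling (an assumption for illustration).
    """
    rng = random.Random(seed)
    pairs = []
    for left_id, left_value in left:
        matches = [
            right_id
            for right_id, right_value in right
            if similar(left_value, right_value, threshold)
        ]
        if len(matches) > cap:
            matches = rng.sample(matches, cap)
        pairs.extend((left_id, right_id) for right_id in matches)
    return sorted(pairs)


# Hypothetical movie titles stored differently in two databases.
db1_titles = [("m1", "Star Wars: A New Hope"), ("m2", "Pulp Fiction")]
db2_titles = [("x1", "Star Wars - A New Hope"), ("x2", "pulp fiction.")]

if __name__ == "__main__":
    print(similarity_join(db1_titles, db2_titles))
```

An exact equality join over these two title columns would return no pairs, which is why a similarity operator is needed when the attribute contents do not match exactly.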