Back to Search
Start Over
Regularizing Conjunctive Features for Classification
- Source :
- PODS
- Publication Year :
- 2019
- Publisher :
- ACM, 2019.
-
Abstract
- We consider the feature-generation task wherein we are given a database with entities labeled as positive and negative examples, and the goal is to find feature queries that allow for a linear separation between the two sets of examples. We focus on conjunctive feature queries, and explore two fundamental problems: (a) deciding whether separating feature queries exist (separability), and (b) generating such queries when they exist. In the approximate versions of these problems, we allow a predefined fraction of the examples to be misclassified. To restrict the complexity of the generated classifiers, we explore various ways of regularizing (i.e., imposing simplicity constraints on) them by limiting their dimension, the number of joins in feature queries, and their generalized hypertree width (ghw). Among other results, we show that the separability problem is tractable in the case of bounded ghw; yet, the generation problem is intractable, simply because the feature queries might be too large. So, we explore a third problem: classifying new entities without necessarily generating the feature queries. Interestingly, in the case of bounded ghw we can efficiently classify without ever explicitly generating the feature queries.
- Subjects :
- Theoretical computer science
General Computer Science
Feature generation
Computer Networks and Communications
Computer science
Dimension (graph theory)
Generalized hypertree width
Separability
Joins
0102 computer and information sciences
02 engineering and technology
01 natural sciences
Theoretical Computer Science
Conjunctive queries
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
Fraction (mathematics)
Hyperbolic tree
Applied Mathematics
020207 software engineering
Limiting
Classification
Task (computing)
Computational Theory and Mathematics
010201 computation theory & mathematics
restrict
Feature (computer vision)
Bounded function
Conjunctive query
Focus (optics)
Subjects
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
- Accession number :
- edsair.doi.dedup.....c2dbb3e935a309db5ffd9a805eba2428
- Full Text :
- https://doi.org/10.1145/3294052.3319680