6 results on '"Benny Kimelfeld"'
Search Results
2. The Complexity of Aggregates over Extractions by Regular Expressions
- Author
-
Johannes Doleschal, Benny Kimelfeld, and Wim Martens
- Subjects
computer science - databases ,computer science - formal languages and automata theory ,Logic ,BC1-199 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Regular expressions with capture variables, also known as regex-formulas, extract relations of spans (intervals identified by their start and end indices) from text. In turn, the class of regular document spanners is the closure of the regex formulas under the Relational Algebra. We investigate the computational complexity of querying text by aggregate functions, such as sum, average, and quantile, on top of regular document spanners. To this end, we formally define aggregate functions over regular document spanners and analyze the computational complexity of exact and approximate computation. More precisely, we show that in a restricted case, all studied aggregate functions can be computed in polynomial time. In general, however, even though exact computation is intractable, some aggregates can still be approximated with fully polynomial-time randomized approximation schemes (FPRAS).
- Published
- 2023
- Full Text
- View/download PDF
3. Uniform Reliability of Self-Join-Free Conjunctive Queries
- Author
-
Antoine Amarilli and Benny Kimelfeld
- Subjects
computer science - databases ,Logic ,BC1-199 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
The reliability of a Boolean Conjunctive Query (CQ) over a tuple-independent probabilistic database is the probability that the CQ is satisfied when the tuples of the database are sampled one by one, independently, with their associated probability. For queries without self-joins (repeated relation symbols), the data complexity of this problem is fully characterized by a known dichotomy: reliability can be computed in polynomial time for hierarchical queries, and is #P-hard for non-hierarchical queries. Inspired by this dichotomy, we investigate a fundamental counting problem for CQs without self-joins: how many sets of facts from the input database satisfy the query? This is equivalent to the uniform case of the query reliability problem, where the probability of every tuple is required to be 1/2. Of course, for hierarchical queries, uniform reliability is solvable in polynomial time, like the reliability problem. We show that being hierarchical is also necessary for this tractability (under conventional complexity assumptions). In fact, we establish a generalization of the dichotomy that covers every restricted case of reliability in which the probabilities of tuples are determined by their relation.
- Published
- 2022
- Full Text
- View/download PDF
4. The Shapley Value of Inconsistency Measures for Functional Dependencies
- Author
-
Ester Livshits and Benny Kimelfeld
- Subjects
computer science - databases ,Logic ,BC1-199 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Quantifying the inconsistency of a database is motivated by various goals including reliability estimation for new datasets and progress indication in data cleaning. Another goal is to attribute to individual tuples a level of responsibility to the overall inconsistency, and thereby prioritize tuples in the explanation or inspection of dirt. Therefore, inconsistency quantification and attribution have been a subject of much research in Knowledge Representation and, more recently, in Databases. As in many other fields, a conventional responsibility sharing mechanism is the Shapley value from cooperative game theory. In this paper, we carry out a systematic investigation of the complexity of the Shapley value in common inconsistency measures for functional-dependency (FD) violations. For several measures we establish a full classification of the FD sets into tractable and intractable classes with respect to Shapley-value computation. We also study the complexity of approximation in intractable cases.
- Published
- 2022
- Full Text
- View/download PDF
5. Weight Annotation in Information Extraction
- Author
-
Johannes Doleschal, Benny Kimelfeld, Wim Martens, and Liat Peterfreund
- Subjects
computer science - databases ,computer science - formal languages and automata theory ,computer science - logic in computer science ,Logic ,BC1-199 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
The framework of document spanners abstracts the task of information extraction from text as a function that maps every document (a string) into a relation over the document's spans (intervals identified by their start and end indices). For instance, the regular spanners are the closure under the Relational Algebra (RA) of the regular expressions with capture variables, and the expressive power of the regular spanners is precisely captured by the class of VSet-automata -- a restricted class of transducers that mark the endpoints of selected spans. In this work, we embark on the investigation of document spanners that can annotate extractions with auxiliary information such as confidence, support, and confidentiality measures. To this end, we adopt the abstraction of provenance semirings by Green et al., where tuples of a relation are annotated with the elements of a commutative semiring, and where the annotation propagates through the positive RA operators via the semiring operators. Hence, the proposed spanner extension, referred to as an annotator, maps every string into an annotated relation over the spans. As a specific instantiation, we explore weighted VSet-automata that, similarly to weighted automata and transducers, attach semiring elements to transitions. We investigate key aspects of expressiveness, such as the closure under the positive RA, and key aspects of computational complexity, such as the enumeration of annotated answers and their ranked enumeration in the case of ordered semirings. For a number of these problems, fundamental properties of the underlying semiring, such as positivity, are crucial for establishing tractability.
- Published
- 2022
- Full Text
- View/download PDF
6. The Shapley Value of Tuples in Query Answering
- Author
-
Ester Livshits, Leopoldo Bertossi, Benny Kimelfeld, and Moshe Sebag
- Subjects
computer science - databases ,Logic ,BC1-199 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
We investigate the application of the Shapley value to quantifying the contribution of a tuple to a query answer. The Shapley value is a widely known numerical measure in cooperative game theory and in many applications of game theory for assessing the contribution of a player to a coalition game. It has been established already in the 1950s, and is theoretically justified by being the very single wealth-distribution measure that satisfies some natural axioms. While this value has been investigated in several areas, it received little attention in data management. We study this measure in the context of conjunctive and aggregate queries by defining corresponding coalition games. We provide algorithmic and complexity-theoretic results on the computation of Shapley-based contributions to query answers; and for the hard cases we present approximation algorithms.
- Published
- 2021
- Full Text
- View/download PDF
Catalog
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.