16 results on '"Hucke, Danny"'
Search Results
2. Tree Compression Using String Grammars
- Author
-
Ganardi, Moses, Hucke, Danny, Lohrey, Markus, and Noeth, Eric
- Published
- 2017
- Full Text
- View/download PDF
3. Entropy Bounds for Grammar-Based Tree Compressors.
- Author
-
Hucke, Danny, Lohrey, Markus, and Benkner, Louisa Seelbach
- Subjects
- *
ENTROPY (Information theory) , *COMPRESSORS , *TREE size , *TREES , *DOCUMENT clustering - Abstract
The definition of $\text {k}^{th}$ -order empirical entropy of strings is extended to node-labelled binary trees: A notion of $\text {k}^{th}$ -order empirical entropy for node-labelled binary trees is proposed that is able to capture regularities in both labels and structure of a tree. A suitable binary encoding of tree straight-line programs (that have been used for grammar-based tree compression before) is shown to yield binary tree encodings of size bounded by the $\text {k}^{th}$ -order empirical entropy plus some lower order terms. This result is then extended from node-labelled binary trees to node-labelled unranked trees. This generalizes recent results for grammar-based string compression to grammar-based tree compression. Additionally, experimental results with real XML document trees are presented, in which the proposed notion of $\text {k}^{th}$ -order empirical tree entropy is computed and compared to the performance of grammar-based tree compressors for those XML document trees. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
4. Grammar-based compression for strings and trees
- Author
-
Hucke, Danny
- Subjects
strings ,Zeichenkette ,Datenkompression ,Kompression ,Bäume ,004 Informatik ,Grammar-based compression ,Kontextfreie Grammatik - Abstract
Das Ziel grammatik-basierter Datenkompression ist es ein Wort (oder einen Text) durch eine kontextfreie Grammatik darzustellen, die ausschließlich dieses Wort erzeugt. Eine kontextfreie Grammatik mit dieser Eigenschaft wird als Straight-line Program (SLP) bezeichnet. Grammatik-basierte Kompression ist ein nützliches Werkzeug um Daten zu komprimieren und verschiedene Anfragen effizient mit Hilfe der komprimierten Repräsentation zu beantworten ohne diese vorher zu dekomprimieren. Im ersten Teil dieser Arbeit werden die grammatikbasierten Kompressoren LZ78, BiSection, RePair, Greedy und LongestMatch untersucht, die zu den bekanntesten Algorithmen in diesem Bereich zählen. In der bedeutenden Veröffentlichung "The smallest grammar problem" von Charikar et al. wurden untere und obere Schranken für die Approximationsgüte verschiedenster grammatik-basierter Kompressoren gezeigt, darunter die oben genannten. Leider besteht für jeden Algorithmus eine Lücke zwischen der unteren und oberen Schranke. In dieser Arbeit werden diese Lücken für die Kompressoren LZ78 und BiSection geschlossen. Des Weiteren werd für RePair und für Greedy die unteren Schranken verbessert. Ein weiteres Ergebnis dieser Arbeit verbessert ein Resultat von Arpe und Reischuk, welches grammatik-basierte Kompression über beliebigen Alphabeten und binären Alphabeten verbindet. Im zweiten Teil dieser Arbeit betrachten wir grammatik-basierte Kompression für Bäume. Das Prinzip ist ähnlich, da in diesem Kontext ein Baum durch eine lineare, kontextfreie Baumgrammatik erzeugt wird. Eine solche Grammatik wird Tree Straight-line Program (TSLP) genannt. Als Hauptresultat in diesem Zusammenhang werden zwei Algorithmen präsentiert, die für einen Baum mit n Knoten und konstant vielen verschiedenen Beschriftungen der Knoten ein TSLP der Gr öße O(n/log n) generieren. Zusätzlich wird erreicht, dass das TSLP logarithmische Tiefe hat. Diese Eigenschaften können in logarithmischen Platz oder alternativ in linearer Zeit erreicht werden. Entsprechende Resultate für grammatik-basierte Kompression von Wörtern sind wohl bekannt. Es werden zusätzlich zwei Anwendungen präsentiert: Zuerst werden TSLPs für die Umwandlung von arithmetischen Formeln zu arithmetischen Schaltkreisen der Größe O((n*log m)/log n) und Tiefe O(log n) genutzt, wobei n die Größe der Formel und m die Anzahl verschiedener Variablen in der Formel ist. Als zweite Anwendung wird eine binäre Kodierung von unbeschrifteten Binärbäumen auf der Basis von grammatik-basierter Baumkompression präsentiert. Es wird gezeigt, dass diese Kodierung unter gewissen Voraussetzungen asymptotisch optimal ist., The goal of grammar-based compression is to represent a string by a small context-free grammar that produces only this string. Such a grammar is called a straight-line program (SLP). Grammar-based compression is a powerful tool to efficiently store data and process the compressed representation without decompressing it. In the first part of this work, we study the grammar-based compressors LZ78, BiSection, Repair, Greedy and LongestMatch, which are among the most popular compressors in this area. In the seminal work "The smallest grammar problem" by Charikar et al., the authors derived lower and upper bounds on the approximation ratios of several grammar-based compressors including the algorithms mentioned above. Unfortunately, for none of the compressors the presented bounds matched. Here, we close the gaps for LZ78 and BiSection. For RePair and Greedy we improve the lower bounds. Moreover, we improve a result of Arpe and Reischuk which relates grammar-based compression for arbitrary alphabets and binary alphabets. In the second part of this work, we consider grammar-based compression for trees. The main principle is similar, because the goal is to represent a tree by a small linear context-free tree grammar that produces only this tree. Such a tree grammar is called a tree straight-line program (TSLP). As a main contribution, we present two algorithms that produce a TSLP of size O(n/log n) for any given tree with n nodes and a constant set of node labels, where we assume that the maximal number of children of a node in the tree is also bounded by a constant. Additionally, the obtained TSLP has logarithmic depth. We show that those properties can be achieved in logarithmic space, or alternatively, in linear time. Similar results on the worst-case size of SLPs are well known. We use our constructions for two applications: First, we apply TSLPs to the problem of transforming arithmetical formulas into equivalent circuits of size O((n*log m)/log n) and depth O(log n), where n is the size of the formula and m is number of different variables occurring in the formula. As a second application, we present a binary encoding of unlabeled binary trees based on grammar-based tree compression. We prove that this encoding is worst-case universal and thus asymptotically optimal for certain tree sources.
- Published
- 2019
- Full Text
- View/download PDF
5. Sliding Window Property Testing for Regular Languages
- Author
-
Ganardi, Moses, Hucke, Danny, Lohrey, Markus, Starikovskaya, Tatiana, Universität Siegen [Siegen], Département d'informatique - ENS Paris (DI-ENS), École normale supérieure - Paris (ENS-PSL), and Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
FOS: Computer and information sciences ,Computer Science - Computation and Language ,000 Computer science, knowledge, general works ,Formal Languages and Automata Theory (cs.FL) ,Computer Science - Formal Languages and Automata Theory ,2012 ACM Subject Classification Theory of computation → Streaming ,regular languages ,2012 ACM Subject Classification Theory of computation → Streaming sublinear and near linear time algorithms phrases Streaming algorithms approximation algorithms regular languages ,Computer Science - Data Structures and Algorithms ,Computer Science ,[INFO]Computer Science [cs] ,Data Structures and Algorithms (cs.DS) ,sublinear and near linear time algorithms phrases Streaming algorithms ,approximation algorithms ,Computation and Language (cs.CL) - Abstract
We study the problem of recognizing regular languages in a variant of the streaming model of computation, called the sliding window model. In this model, we are given a size of the sliding window $n$ and a stream of symbols. At each time instant, we must decide whether the suffix of length $n$ of the current stream ("the active window") belongs to a given regular language. Recent works showed that the space complexity of an optimal deterministic sliding window algorithm for this problem is either constant, logarithmic or linear in the window size $n$ and provided natural language theoretic characterizations of the space complexity classes. Subsequently, those results were extended to randomized algorithms to show that any such algorithm admits either constant, double logarithmic, logarithmic or linear space complexity. In this work, we make an important step forward and combine the sliding window model with the property testing setting, which results in ultra-efficient algorithms for all regular languages. Informally, a sliding window property tester must accept the active window if it belongs to the language and reject it if it is far from the language. We consider deterministic and randomized sliding window property testers with one-sided and two-sided errors. In particular, we show that for any regular language, there is a deterministic sliding window property tester that uses logarithmic space and a randomized sliding window property tester with two-sided error that uses constant space., Comment: A short version of this paper was accepted for presentation at ISAAC 2019
- Published
- 2019
- Full Text
- View/download PDF
6. Automata theory on sliding windows
- Author
-
Ganardi, Moses, Hucke, Danny, König, Daniel, Lohrey, Markus, and Mamouras, Konstantinos
- Subjects
060201 languages & linguistics ,FOS: Computer and information sciences ,000 Computer science, knowledge, general works ,Formal Languages and Automata Theory (cs.FL) ,Computer Science - Formal Languages and Automata Theory ,06 humanities and the arts ,02 engineering and technology ,68Q45, 68W32 ,0602 languages and literature ,Computer Science ,Computer Science - Data Structures and Algorithms ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Data Structures and Algorithms (cs.DS) - Abstract
In a recent paper we analyzed the space complexity of streaming algorithms whose goal is to decide membership of a sliding window to a fixed language. For the class of regular languages we proved a space trichotomy theorem: for every regular language the optimal space bound is either constant, logarithmic or linear. In this paper we continue this line of research: We present natural characterizations for the constant and logarithmic space classes and establish tight relationships to the concept of language growth. We also analyze the space complexity with respect to automata size and prove almost matching lower and upper bounds. Finally, we consider the decision problem whether a language given by a DFA/NFA admits a sliding window algorithm using logarithmic/constant space., Extended version of a paper presented at STACS 2018
- Published
- 2017
7. Universal Tree Source Coding Using Grammar-Based Compression.
- Author
-
Ganardi, Moses, Hucke, Danny, Lohrey, Markus, and Seelbach Benkner, Louisa
- Subjects
- *
SOURCE code , *DIRECTED acyclic graphs , *CODING theory , *MAXIMAL functions , *BINARY codes , *TREES - Abstract
The problem of universal source coding for binary trees is considered. Zhang, Yang, and Kieffer derived upper bounds on the average-case redundancy of codes based on directed acyclic graph (DAG) compression for binary tree sources with certain properties. In this paper, a natural class of binary tree sources is presented such that the demanded properties are fulfilled. Moreover, for both subclasses considered in the paper of Zhang, Yang, and Kieffer, their result is improved by deriving bounds on the maximal pointwise redundancy (or worst-case redundancy) instead of the average-case redundancy. Finally, using context-free tree grammars instead of DAGs, upper bounds on the maximal pointwise redundancy for certain binary tree sources are derived. This yields universal codes for new classes of binary tree sources. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
8. Circuit Evaluation for Finite Semirings
- Author
-
Ganardi, Moses, Hucke, Danny, König, Daniel, and Lohrey, Markus
- Subjects
FOS: Computer and information sciences ,Computer Science - Computational Complexity ,000 Computer science, knowledge, general works ,010201 computation theory & mathematics ,010102 general mathematics ,Computer Science ,68W30, 16Z05, 68W10 ,0102 computer and information sciences ,0101 mathematics ,Computational Complexity (cs.CC) ,01 natural sciences - Abstract
The computational complexity of the circuit evaluation problem for finite semirings is considered, where semirings are not assumed to have an additive or multiplicative identity. The following dichotomy is shown: If a finite semiring is such that (i) the multiplicative semigroup is solvable and (ii) it does not contain a subsemiring with an additive identity $0$ and a multiplicative identity $1 \neq 0$, then the circuit evaluation problem for the semiring is in $\mathsf{DET} \subseteq \mathsf{NC}^2$. In all other cases, the circuit evaluation problem is $\mathsf{P}$-complete., Some proof details from the previous version are simplified in the new version
- Published
- 2016
9. Querying Regular Languages over Sliding Windows
- Author
-
Ganardi, Moses, Hucke, Danny, and Lohrey, Markus
- Subjects
060201 languages & linguistics ,000 Computer science, knowledge, general works ,0602 languages and literature ,Computer Science ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,06 humanities and the arts ,02 engineering and technology - Abstract
We study the space complexity of querying regular languages over data streams in the sliding window model. The algorithm has to answer at any point of time whether the content of the sliding window belongs to a fixed regular language. A trichotomy is shown: For every regular language the optimal space requirement is either in Theta(n), Theta(log(n)), or constant, where $n$ is the size of the sliding window.
- Published
- 2016
- Full Text
- View/download PDF
10. Constructing Small Tree Grammars and Small Circuits for Formulas
- Author
-
Hucke, Danny, Lohrey, Markus, and Noeth, Eric
- Subjects
000 Computer science, knowledge, general works ,Computer Science - Abstract
It is shown that every tree of size n over a fixed set of sigma different ranked symbols can be decomposed into O(n/log_sigma(n)) = O((n * log(sigma))/ log(n)) many hierarchically defined pieces. Formally, such a hierarchical decomposition has the form of a straight-line linear context-free tree grammar of size O(n/log_sigma(n)), which can be used as a compressed representation of the input tree. This generalizes an analogous result for strings. Previous grammar-based tree compressors were not analyzed for the worst-case size of the computed grammar, except for the top dag of Bille et al., for which only the weaker upper bound of O(n/log^{0.19}(n)) for unranked and unlabelled trees has been derived. The main result is used to show that every arithmetical formula of size n, in which only m
- Published
- 2014
- Full Text
- View/download PDF
11. Tree Compression Using String Grammars.
- Author
-
Ganardi, Moses, Hucke, Danny, Lohrey, Markus, and Noeth, Eric
- Subjects
- *
STRING theory , *STRAIGHT-line mechanisms , *BOOLEAN functions , *PATTERN matching , *QUERY languages (Computer science) - Abstract
We study the compressed representation of a ranked tree by a (string) straight-line program (SLP) for its preorder traversal, and compare it with the well-studied representation by straight-line context free tree grammars (which are also known as tree straight-line programs or TSLPs). Although SLPs may be exponentially more succinct than TSLPs, we show that many simple tree queries can still be performed efficiently on SLPs, such as computing the height of a tree, tree navigation, or evaluation of Boolean expressions. Other problems on tree traversals turn out to be intractable, e.g. pattern matching and evaluation of tree automata. These problems can be still solved in polynomial time for TSLPs. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
12. The Smallest Grammar Problem Revisited.
- Author
-
Hucke, Danny, Lohrey, Markus, and Reh, Carl Philipp
- Published
- 2016
- Full Text
- View/download PDF
13. Tree Compression Using String Grammars.
- Author
-
Ganardi, Moses, Hucke, Danny, Lohrey, Markus, and Noeth, Eric
- Published
- 2016
- Full Text
- View/download PDF
14. Constructing small tree grammars and small circuits for formulas.
- Author
-
Ganardi, Moses, Hucke, Danny, Jeż, Artur, Lohrey, Markus, and Noeth, Eric
- Subjects
- *
QUANTUM computing , *ALGORITHMS , *ELECTRIC circuits , *ARITHMETICAL algebraic geometry , *DATA compression - Abstract
It is shown that every tree of size n over a fixed set of σ different ranked symbols can be produced by a so called straight-line linear context-free tree grammar of size O ( n log σ n ) , which can be used as a compressed representation of the input tree. Two algorithms for computing such tree grammar are presented: One working in logarithmic space and the other working in linear time. As a consequence, it is shown that every arithmetical formula of size n , in which only m ≤ n different variables occur, can be transformed (in linear time as well as in logspace) into an arithmetical circuit of size O ( n ⋅ log m log n ) and depth O ( log n ) . This refines a classical result of Brent from 1974, according to which an arithmetical formula of size n can be transformed into a logarithmic depth circuit of size O ( n ) . [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
15. Approximation Ratios of RePair, LongestMatch and Greedy on Unary Strings †.
- Author
-
Hucke, Danny, Reh, Carl Philipp, and Miritescu, Catalina-Ana
- Subjects
- *
GREEDY algorithms , *APPROXIMATION algorithms , *DATA compression - Abstract
A grammar-based compressor is an algorithm that receives a word and outputs a context-free grammar that only produces this word. The approximation ratio for a single input word is the size of the grammar produced for this word divided by the size of a smallest grammar for this word. The worst-case approximation ratio of a grammar-based compressor for a given word length is the largest approximation ratio over all input words of that length. In this work, we study the worst-case approximation ratio of the algorithms Greedy, RePair and LongestMatch on unary strings, i.e., strings that only make use of a single symbol. Our main contribution is to show the improved upper bound of O ((log n) 8 · (log log n) 3) for the worst-case approximation ratio of Greedy. In addition, we also show the lower bound of 1.34847194 ⋯ for the worst-case approximation ratio of Greedy, and that RePair and LongestMatch have a worst-case approximation ratio of log 2 (3) . [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
16. Universal Tree Source Coding Using Grammar-Based Compression
- Author
-
Markus Lohrey, Danny Hucke, and Hucke, Danny
- Subjects
Discrete mathematics ,Pointwise ,FOS: Computer and information sciences ,Source code ,Binary tree ,Grammar ,Computer science ,Computer Science - Information Theory ,media_common.quotation_subject ,Information Theory (cs.IT) ,020206 networking & telecommunications ,0102 computer and information sciences ,02 engineering and technology ,Directed acyclic graph ,01 natural sciences ,Upper and lower bounds ,Redundancy (information theory) ,010201 computation theory & mathematics ,0202 electrical engineering, electronic engineering, information engineering ,[INFO.INFO-IT] Computer Science [cs]/Information Theory [cs.IT] ,68P30, 94A29 ,media_common - Abstract
We apply so-called tree straight-line programs to the problem of universal source coding for binary trees. We derive an upper bound on the maximal pointwise redundancy (or worst-case redundancy) that improve previous bounds on the average case redundancy obtained by Zhang, Yang, and Kieffer using directed acyclic graphs. Using this, we obtain universal codes for new classes of tree sources.
- Published
- 2017
Catalog
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.