9 results on '"Neven, Frank"'
Search Results
2. Inference of concise regular expressions and DTDs
- Author
-
Bex, Geert Jan, Neven, Frank, Schwentick, Thomas, and Vansummeren, Stijn
- Subjects
Algorithm ,Data warehousing/data mining ,Artificial intelligence ,Algorithms -- Usage ,Data mining -- Methods ,Artificial intelligence -- Research ,Machine learning -- Research ,Document preparation -- Methods - Abstract
We consider the problem of inferring a concise Document Type Definition (DTD) for a given set of XML-documents, a problem that basically reduces to learning concise regular expressions from positive examples strings. We identify two classes of concise regular expressions--the single occurrence regular expressions (SOREs) and the chain regular expressions (CHAREs)--that capture the far majority of expressions used in practical DTDs. For the inference of SOREs we present several algorithms that first infer an automaton for a given set of example strings and then translate that automaton to a corresponding SORE, possibly repairing the automaton when no equivalent SORE can be found. In the process, we introduce a novel automaton to regular expression rewrite technique which is of independent interest. When only a very small amount of XML data is available, however (for instance when the data is generated by Web service requests or by answers to queries), these algorithms produce regular expressions that are too specific. Therefore, we introduce a novel learning algorithm CRx that directly infers CHAREs (which form a subclass of SOREs) without going through an automaton representation. We show that CRX performs very well within its target class on very small datasets. Categories and Subject Descriptors: F.4.3 [Mathematical Logic and Formal Languages]: Formal Languages; H.2.1 [Database Management]: Logical Design; 1.2.6 [Artificial Intelligence]: Learning; 1.7.2 [Document and Text Processing]: Document Preparation General Terms: Algorithms, Languages, Theory Additional Key Words and Phrases: Regular expressions, schema inference, XML DOI = 10.1145/1735886.1735890
- Published
- 2010
3. Expressiveness and complexity of XML publishing transducers
- Author
-
Fan, Wenfei, Geerts, Floris, and Neven, Frank
- Subjects
Electronic publication ,Electronic publishing ,XML ,Electronic publishing -- Analysis ,Transducers -- Analysis ,XML (Document markup language) -- Usage - Abstract
A number of languages have been developed for specifying XML publishing, that is, transformations of relational data into XML trees. These languages generally describe the behaviors of a middleware controller that builds an output tree iteratively, issuing queries to a relational source and expanding the tree with the query results at each step. To study the complexity and expressive power of XML publishing languages, this article proposes a notion of publishing transducers, which generate XML trees from relational data. We study a variety of publishing transducers based on what relational queries a transducer can issue, what temporary stores a transducer can use during tree generation, and whether or not some tree nodes are allowed to be virtual, that is, excluded from the output tree. We first show how existing XML publishing languages can be characterized by such transducers, and thus provide a synergy between theory and practice. We then study the membership, emptiness, and equivalence problems for various classes of transducers. We establish lower and upper bounds, all matching, ranging from PRIME to undecidable. Finally, we investigate the expressive power of these transducers and existing languages. We show that when treated as relational query languages, different classes of transducers capture either complexity classes (e.g., PSPACE) or fragments of datalog (e.g., linear datalog). For tree generation, we establish connections between publishing transducers and logical transductions, among other things. Categories and Subject Descriptors: H.2.5 [Database Management]: Heterogeneous Databases--Data translation; H.1.m [Models and Principles]: Miscellaneous General Terms: Design, Languages, Theory
- Published
- 2008
4. Expressiveness and complexity of XML Schema
- Author
-
Martens, Wim, Neven, Frank, Schwentick, Thomas, and Bex, Geert Jan
- Subjects
XML ,Algorithm ,XML (Document markup language) -- Analysis ,Algorithms -- Analysis - Abstract
The common abstraction of XML Schema by unranked regular tree languages is not entirely accurate. To shed some light on the actual expressive power of XML Schema, intuitive semantical characterizations of the Element Declarations Consistent (EDC) rule are provided. In particular, it is obtained that schemas satisfying EDC can only reason about regular properties of ancestors of nodes. Hence, with respect to expressive power, XML Schema is closer to DTDs than to tree automata. These theoretical results are complemented with an investigation of the XML Schema Definitions (XSDs) occurring in practice, revealing that the extra expressiveness of XSDs over DTDs is only used to a very limited extent. As this might be due to the complexity of the XML Schema specification and the difficulty of understanding the effect of constraints on typing and validation of schemas, a simpler formalism equivalent to XSDs is proposed. It is based on contextual patterns rather than on recursive types and it might serve as a light-weight front end for XML Schema. Next, the effect of EDC on the way XML documents can be typed is discussed. It is argued that a cleaner, more robust, larger but equally feasible class is obtained by replacing EDC with the notion of 1-pass preorder typing (1PPT): schemas that allow one to determine the type of an element of a streaming document when its opening tag is met. This notion can be defined in terms of grammars with restrained competition regular expressions and there is again an equivalent syntactical formalism based on contextual patterns. Finally, algorithms for recognition, simplification, and inclusion of schemas for the various classes are given. Categories and Subject Descriptors: H.2.1 [Database Management]: Logical Design; F.4.3 [Mathematical Logic and Formal Languages]: Formal Languages General Terms: Algorithms, Design, Languages, Standardization, Theory Additional Key Words and Phrases: XML, XML Schema, validation
- Published
- 2006
5. BonXai
- Author
-
Martens, Wim, primary, Neven, Frank, additional, Niewerth, Matthias, additional, and Schwentick, Thomas, additional
- Published
- 2017
- Full Text
- View/download PDF
6. Weaker Forms of Monotonicity for Declarative Networking
- Author
-
Ameloot, Tom J., primary, Ketsman, Bas, additional, Neven, Frank, additional, and Zinn, Daniel, additional
- Published
- 2015
- Full Text
- View/download PDF
7. Discovering XSD Keys from XML Data
- Author
-
Arenas, Marcelo, primary, Daenen, Jonny, additional, Neven, Frank, additional, Ugarte, Martin, additional, Bussche, Jan Van Den, additional, and Vansummeren, Stijn, additional
- Published
- 2014
- Full Text
- View/download PDF
8. Expressiveness and Complexity of XML Publishing Transducers.
- Author
-
WENFEI FAN, GEERTS, FLORIS, and NEVEN, FRANK
- Subjects
DATABASE management ,XML (Extensible Markup Language) ,TRANSDUCERS ,MIDDLEWARE ,QUERYING (Computer science) ,DATABASE searching - Abstract
A number of languages have been developed for specifying XML publishing, that is, transformations of relational data into XML trees. These languages generally describe the behaviors of a middleware controller that builds an output tree iteratively, issuing queries to a relational source and expanding the tree with the query results at each step. To study the complexity and expressive power of XML publishing languages, this article proposes a notion of publishing transducers, which generate XML trees from relational data. We study a variety of publishing transducers based on what relational queries a transducer can issue, what temporary stores a transducer can use during tree generation, and whether or not some tree nodes are allowed to be virtual, that is, excluded from the output tree. We first show how existing XML publishing languages can be characterized by such transducers, and thus provide a synergy between theory and practice. We then study the membership, emptiness, and equivalence problems for various classes of transducers. We establish lower and upper bounds, all matching, ranging from PTIME to undecidable. Finally, we investigate the expressive power of these transducers and existing languages. We show that when treated as relational query languages, different classes of transducers capture either complexity classes (e.g., PSPACE) or fragments of datalog (e.g., linear datalog). For tree generation, we establish connections between publishing transducers and logical transductions, among other things. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
9. BonXai
- Author
-
Frank Neven, Thomas Schwentick, Wim Martens, Matthias Niewerth, MARTENS, Wim, NEVEN, Frank, Niewerth, Matthias, and Schwentick, Thomas
- Subjects
Document Structure Description ,XML Encryption ,Schematron ,Computer science ,computer.internet_protocol ,Efficient XML Interchange ,XML ,BonXai ,XML Schema ,schema languages ,XML Signature ,02 engineering and technology ,computer.software_genre ,Logical schema ,XML Schema Editor ,020204 information systems ,Schema (psychology) ,Streaming XML ,0202 electrical engineering, electronic engineering, information engineering ,RELAX NG ,XML schema ,computer.programming_language ,Programming language ,cXML ,XML validation ,computer.file_format ,XML framework ,XML database ,XML Schema (W3C) ,Document Schema Definition Languages ,Star schema ,Document Definition Markup Language ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,020201 artificial intelligence & image processing ,Data mining ,computer ,Information Systems - Abstract
While the migration from DTD to XML Schema was driven by a need for increased expressivity and flexibility, the latter was also significantly more complex to use and understand. Whereas DTDs are characterized by their simplicity, XML Schema Documents are notoriously difficult. In this article, we introduce the XML specification language BonXai, which incorporates many features of XML Schema but is arguably almost as easy to use as DTDs. In brief, the latter is achieved by sacrificing the explicit use of types in favor of simple patterns expressing contexts for elements. The goal of BonXai is not to replace XML Schema but rather to provide a simpler alternative for users who want to go beyond the expressiveness and features of DTD but do not need the explicit use of types. Furthermore, XML Schema processing tools can be used as a back-end for BonXai, since BonXai can be automatically converted into XML Schema. A particularly strong point of BonXai is its solid foundation rooted in a decade of theoretical work around pattern-based schemas. We present a formal model for a core fragment of BonXai and the translation algorithms to and from a core fragment of XML Schema. We prove that BonXai and XML Schema can be converted back-and-forth on the level of tree languages and we formally study the size trade-offs between the two languages. We acknowledge the financial support of grant number MA 4938/2-1 from the Deutsche Forschungsgemeinschaft (Emmy Noether Nachwuchsgruppe). We further acknowledge the financial support of the Future and Emerging Technologies (FET) programme within the Seventh Framework Programme for Research of the European Commission, under the FET-Open grant agreement FOX, number FP7-ICT-233599.
- Published
- 2017
Catalog
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.