13 results on '"Xml filtering"'
Search Results
2. XML Filtering
- Author
-
LIU, LING, editor and ÖZSU, M. TAMER, editor
- Published
- 2009
- Full Text
- View/download PDF
3. FoXtrot: Distributed Structural and Value XML Filtering.
- Author
-
Miliaraki, Iris and Koubarakis, Manolis
- Subjects
COMPUTER programming, XML (Extensible Markup Language), INFORMATION filtering, QUERY (Information retrieval system), HASHING, PERFORMANCE evaluation
- Abstract
Publish/subscribe systems have emerged in recent years as a promising paradigm for offering various popular notification services. In this context, many XML filtering systems have been proposed to efficiently identify XML data that matches user interests expressed as queries in an XML query language like XPath. However, in order to offer XML filtering functionality on an Internet-scale, we need to deploy such a service in a distributed environment, avoiding bottlenecks that can deteriorate performance. In this work, we design and implement FoXtrot, a system for filtering XML data that combines the strengths of automata for efficient filtering and distributed hash tables for building a fully distributed system. Apart from structural-matching, performed using automata, we also discuss different methods for evaluating value-based predicates. We perform an extensive experimental evaluation of our system, FoXtrot, on a local cluster and on the PlanetLab network and demonstrate that it can index millions of user queries, achieving a high indexing and filtering throughput. At the same time, FoXtrot exhibits very good load-balancing properties and improves its performance as we increase the size of the network. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
4. Efficient Filtering of Branch Queries for High-Performance XML Data Services.
- Author
-
Choi, Ryan H. and Wong, Raymond K.
- Subjects
XML (Extensible Markup Language), WEB development, CASCADING style sheets, QUERY languages (Computer science), REVERSE engineering, PROGRAMMING languages, HYPERTEXT systems, WORLD Wide Web, COMPUTER networks
- Abstract
Efficient XML filtering has been the fundamental technique in recent Web service and XML publish/subscribe applications. In this article, we consider the problem of filtering streaming XML data efficiently against a large number of branch XPath queries. To improve the performance of XML filtering, branch queries are grouped into similar queries, and the common paths between queries in the same group are identified. After performing structural matching of queries, queries are organized in a way that multiple queries can be evaluated simultaneously in the post-processing phase. In the post-processing phase, join operations are executed in a pipeline fashion, and intermediate join results are shared amongst the queries in the same group. As a result, the total number of join operations performed in the post-processing phase is significantly reduced. In addition, we also present how to efficiently return all matching elements for each matching branch query. Experiments show that our proposal is efficient and scalable compared to previous work. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
5. Scalable Filtering of Multiple Generalized-Tree-Pattern Queries over XML Streams.
- Author
-
Songting Chen, Hua-Gang Li, Tatemura, Jun'ichi, Wang-Pin Hsiung, Agrawal, Divyakant, and Candan, K. Selçuk
- Subjects
-
*INFORMATION storage & retrieval systems, *XML (Extensible Markup Language), *ALGORITHMS, *ENCODING, *INFORMATION filtering, *INFORMATION retrieval, *ALGEBRA, *METHODOLOGY, *COMPUTER systems
- Abstract
An XML publish/subscribe system needs to filter a large number of queries over XML streams. Most existing systems only consider filtering the simple XPath statements. In this paper, we focus on filtering of the more complex Generalized-Tree-Pattern (GTP) queries. Our filtering mechanism is based on a novel Tree-of-Path (TOP) encoding scheme, which compactly represents the path matches for the entire document. First, we show that the TOP encodings can be efficiently produced via shared bottom-up path matching. Second, with the aid of this TOP encoding, we can 1) achieve polynomial time and space complexity for postprocessing, 2) avoid redundant predicate evaluations, 3) allow an efficient duplicate-free and merge join-based algorithm for merging multiple encoded path matches, and 4) simplify the processing of GTP queries. Overall, our approach maximizes the sharing opportunity across queries by exploiting the suffix as well as prefix sharing. At the same time, our TOP encodings allow efficient postprocessing for GTP queries. Extensive performance studies show that GFilter not only achieves significantly better filtering performance than state-of-the-art algorithms but also is capable of efficiently filtering the more complex GTP queries. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
6. Value-based predicate filtering of XML documents
- Author
-
Kwon, Joonho, Rao, Praveen, Moon, Bongki, and Lee, Sukho
- Subjects
-
*XML (Extensible Markup Language), *INFORMATION filtering, *SUBSCRIPTION services, *XPATH (Computer program language)
- Abstract
In recent years, publish–subscribe systems based on XML filtering have received much attention in ubiquitous computing environments and Internet applications. The main challenge is to process a large volume of content against millions of user subscriptions. Several XML filtering systems focus on the efficient processing of structural matching of user subscriptions represented as XPath twig patterns. However, existing techniques provide limited or no support for twig patterns that contain various operators in the value-based predicates. In this paper, we present the pFiST system that filters XML documents by transforming twig patterns into sequences based on Prüfer’s method. This sequencing idea for XML filtering was first demonstrated by FiST [J. Kwon, P. Rao, B. Moon, S. Lee, FiST: scalable XML document filtering by sequencing twig patterns, in: Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005, pp. 217–228]. The focus of pFiST is to support value-based predicates in twig patterns in addition to matching their structure. The pFiST system supports equality and non-equality operators, and in addition can handle logical operators such as AND and OR in the value-based predicates. Extensive experimental results show that pFiST provides good performance over data sets with different characteristics. [Copyright Elsevier]
- Published
- 2008
- Full Text
- View/download PDF
7. Cache-Conscious Automata for XML Filtering.
- Author
-
Bingsheng He, Qiong Luo, and Byron Choi
- Subjects
-
*CACHE memory, *COMPUTER input-output equipment, *XML (Extensible Markup Language), *FILTERING software, *COMPUTER software, *ACCESS control, *COMPUTER storage devices, *LINE drivers (Integrated circuits), *COMPUTER interfaces, *MACHINE theory
- Abstract
Hardware cache behavior is an important factor in the performance of memory-resident, data-intensive systems such as XML filtering engines. A key data structure in several recent XML filters is the automaton, which is used to represent the long-running XML queries in the main memory. In this paper, we study the cache performance of automaton-based XML filtering through analytical modeling and system measurement. Furthermore, we propose a cache-conscious automaton organization technique, called the hot buffer, to improve the locality of automaton state transitions. Our results show that 1) our cache performance model for XML filtering automata is highly accurate and 2) the hot buffer improves the cache performance as well as the overall performance of automaton-based XML filtering. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
8. Rule-based Methodologies for the Specification and Analysis of Complex Computing Systems
- Author
-
Baggi, Michele, Alpuente Frasnedo, María, Falaschi, Moreno, and Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació
- Subjects
System of systems, Domain-specific language, Distributed Computing Environment, Theoretical computer science, business.industry, Computer science, Formal methods, Big data, Rule-based system, Software, Xml filtering, Formal specification, Program transformation, Web verification, The Internet, Systems biology, business, Software engineering, LENGUAJES Y SISTEMAS INFORMATICOS, Rule-based methodologies
- Abstract
From the origins of hardware and software to the present day, the complexity of computing systems has posed a problem that computer scientists, engineers, and programmers have had to confront. This effort has given rise to important research areas that have since matured. In this dissertation we address some of the current lines of research related to the analysis and verification of complex computing systems using formal methods and domain-specific languages. The thesis focuses on distributed systems, with particular interest in Web systems and biological systems. The first part of the thesis is devoted to security aspects and related techniques, specifically software certification. We first study systems for controlling access to resources and propose a language for specifying access control policies that are tightly coupled to knowledge bases and that provide a description sensitive to the semantics of the resources or elements being accessed. We have also developed a novel framework for Code-Carrying Theory, a software certification methodology whose goal is to ensure the secure delivery of code in a distributed environment. Our framework is based on a system for transforming rewriting theories by means of fold/unfold operations. The second part of the thesis concentrates on the analysis and verification of Web systems and biological systems. We propose a language for information filtering that supports information retrieval from large data stores. This language uses semantic information obtained from remote ontologies to refine the filtering process. We also study validation methods for checking the consistency of web content with respect to syntactic and semantic properties. Another of our contributions is a language that makes it possible to define and automatically check semantic and syntactic constraints on the static content of a Web system. Finally, we also consider biological systems and focus on a formalism based on rewriting logic for modeling and analyzing quantitative aspects of biological processes. To evaluate the effectiveness of all the proposed methodologies, we have paid special attention to developing prototypes implemented in rule-based languages.
Baggi, M. (2010). Rule-based Methodologies for the Specification and Analysis of Complex Computing Systems [Unpublished doctoral thesis]. Universitat Politècnica de València. doi:10.4995/Thesis/10251/8964.
- Published
- 2015
- Full Text
- View/download PDF
9. Chimera: Stream-Oriented XML Filtering/Querying Engine
- Author
-
Tatsuya Asai, Shinichiro Tago, Seishi Okamoto, Hiroya Inakoshi, and Masayuki Takeda
- Subjects
Event stream, Computer science, computer.internet_protocol, XPath 2.0, InformationSystems_DATABASEMANAGEMENT, Xml filtering, computer.software_genre, Xml data, Relational database management system, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Boolean expression, Data mining, computer, Identity transform, computer.programming_language, XPath
- Abstract
In this paper, we study the problem of filtering and querying massive XML data against a large set of XPath patterns in Univariate XPath. Based on an efficient matching engine XSIGMA for linear XPath patterns with Boolean expression over keywords and a twig evaluator over event streams, we propose an XPath filtering/querying engine Chimera, which runs fast and stably for any XPath patterns without the heavy preprocessing of queried data often used by existing native XMLDBs and RDBs. Chimera also runs much faster than those engines against thousands of XPath patterns. We implemented Chimera and showed its effectiveness by several experiments on artificial and real datasets.
- Published
- 2010
- Full Text
- View/download PDF
10. A model for aggregation and filtering on encrypted XML streams in fog computing.
- Author
-
Huang, Jyun-Yao, Hong, Wei-Chih, Tsai, Po-Shin, and Liao, I-En
- Subjects
-
*INTERNET of things, *INNOVATION adoption, *ATTITUDES toward technology, *DATA encryption, *XML (Extensible Markup Language), *ECONOMICS
- Abstract
The Internet of Things provides visions of innovative services and domain-specific applications. With the development of Internet of Things services, various structural data need to be transferred over the Internet. However, protecting structural information that contains sensitive data has raised concerns against Internet of Things services. For a publish/subscribe scenario consisting of sensors, fog nodes, and subscribers, we propose a model that (1) expands the present XML Encryption standard for data with string and numeric types implemented in the sensors, (2) efficiently and discreetly filters matched streaming data and performs summation in the fog nodes, and (3) decrypts the filtered and aggregated data in the subscribers without revealing privacy data. The experimental results of the performance on fog node implemented by PC or Raspberry Pi show that the proposed model can rapidly process multiple encrypted XML streams generated by sensors in a parallel manner without revealing privacy data to subscribers. Therefore, the proposed model is a solution to the fog computing applications in which the privacy preservation of sensor data is of great concern. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
11. Rule-based Methodologies for the Specification and Analysis of Complex Computing Systems
- Author
-
Alpuente Frasnedo, María, Falaschi, Moreno, Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació, and Baggi, Michele
- Abstract
From the origins of hardware and software to the present day, the complexity of computing systems has posed a problem that computer scientists, engineers, and programmers have had to confront. This effort has given rise to important research areas that have since matured. In this dissertation we address some of the current lines of research related to the analysis and verification of complex computing systems using formal methods and domain-specific languages. The thesis focuses on distributed systems, with particular interest in Web systems and biological systems. The first part of the thesis is devoted to security aspects and related techniques, specifically software certification. We first study systems for controlling access to resources and propose a language for specifying access control policies that are tightly coupled to knowledge bases and that provide a description sensitive to the semantics of the resources or elements being accessed. We have also developed a novel framework for Code-Carrying Theory, a software certification methodology whose goal is to ensure the secure delivery of code in a distributed environment. Our framework is based on a system for transforming rewriting theories by means of fold/unfold operations. The second part of the thesis concentrates on the analysis and verification of Web systems and biological systems. We propose a language for information filtering that supports information retrieval from large data stores. This language uses semantic information obtained from remote ontologies to refine the filtering process. We also study validation methods for checking the consistency of web content with respect to syntactic and semantic properties. Another of our contributions is
- Published
- 2010
12. Cache-conscious Automata for XML Filtering
- Author
-
He, Bingsheng, Luo, Qiong, and Choi, Byron
- Abstract
Hardware cache behavior is an important factor in the performance of memory-resident, data-intensive systems such as XML filtering engines. A key data structure in several recent XML filters is the automaton, which is used to represent the long-running XML queries in the main memory. In this paper, we study the cache performance of automaton-based XML filtering through analytical modeling and system measurement. Furthermore, we propose a cache-conscious automaton organization technique, called the hot buffer, to improve the locality of automaton state transitions. Our results show that 1) our cache performance model for XML filtering automata is highly accurate and 2) the hot buffer improves the cache performance as well as the overall performance of automaton-based XML filtering.
- Published
- 2006
13. XFIS: an XML filtering system based on string representation and matching.
- Author
-
Antonellis, Panagiotis and Makris, Christos
- Subjects
INFORMATION filtering, XML (Extensible Markup Language), XPATH (Computer program language), HEURISTIC programming, ALGORITHMS
- Abstract
Information-filtering systems constitute a critical component of modern information-seeking applications. As the number of users grows and the amount of information available becomes even bigger, it is imperative to employ scalable and efficient representation and filtering techniques. Typically, the use of eXtensible Markup Language (XML) representation entails profile representation with the use of the XPath query language and the employment of efficient heuristic techniques for constraining the complexity of the filtering mechanism. In this paper, we propose an efficient technique for matching user profiles that is based on the use of holistic twig-matching algorithms and is more effective, in terms of time and space complexities, in comparison with previous techniques. The proposed algorithm is able to handle order matching of user profiles, while its main positive aspect is the envisaging of a representation based on Prüfer sequences that permits the effective investigation of node relationships. Experimental results showed that the proposed algorithm outperforms the previous algorithms in XML filtering both in space and time aspects. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library