6 results on '"queries"'
Search Results
2. Data Partitioning Methods to Process Queries on Encrypted Databases on the Cloud
- Author
- Omran, Osama M
- Subjects
- Applied sciences, Cloud, Database security, Databases, Encrypted, Partitioning, Queries, Databases and Information Systems
- Abstract
Many features and advantages have been brought to organizations and computer users by Cloud computing. It allows different service providers to distribute many applications and services in an economical way. Consequently, many users and companies have begun using cloud computing. However, the users and companies are concerned about their data when data are stored and managed in the Cloud or outsourcing servers. The private data of individual users and companies is stored and managed by the service providers on the Cloud, which offers services on the other side of the Internet in terms of its users, and consequently results in privacy concerns [61]. In this dissertation, a technique has been explored to improve query processing performance while protecting database tables on a Cloud by encrypting those so that they remain secure. It shows how to process SQL queries on encrypted databases designed to protect data from any leakage or attack, even from the service providers. The strategy is to process the query on the Cloud without having to decrypt the data, and data decryption is performed only at the client site. Therefore, to achieve efficiency, no more than the exact set of requested data is returned to the client. In addition, four different techniques have been developed to index and partition the data. The indexes and partitions of the data are used to select part of the data from the Cloud or outsource data depending on the required data. The index data can be stored on the Cloud or server with the encrypted database table. This helps in reducing the entire processing time, which includes data transfer time from the Cloud to the client and also data decryption and processing time at the client.
- Published
- 2016
3. Benchmarking Performance for Migrating a Relational Application to a Parallel Implementation
- Author
- Gadiraju, Krishna Karthik
- Subjects
- Computer Science, Hive, Hadoop, benchmarking, big data, SQL, queries
- Abstract
Many organizations rely on relational database platforms for OLAP-style querying (aggregation and filtering) for small to medium size applications. We investigate the impact of scaling up the data sizes for such queries. We intend to illustrate what kind of performance results an organization could expect should they migrate current applications to big data environments. This thesis benchmarks the performance of Hive, a parallel data warehouse platform that is a part of the Hadoop software stack. We set up a 4-node Hadoop cluster using Hortonworks HDP 1.3.2. We use the data generator provided by the TPC-DS benchmark to generate data of different scales. We use a representative query provided in the TPC-DS query set and run the SQL and Hive Query Language (HiveQL) versions of the same query on a relational database installation (MySQL) and on the Hive cluster. An analysis of the results shows that for all the dataset sizes used, Hive is faster than MySQL when executing the query. Hive loads the large datasets faster than MySQL, while it is marginally slower than MySQL when loading the smaller datasets.
- Published
- 2014
4. An Empirical Study of Novel Approaches to Dimensionality Reduction and Applications
- Author
- Nsang, Augustine S.
- Subjects
- Computer Science, dimensionality reduction, random projections, clustering, classification, queries, web data
- Abstract
Dimensionality reduction is becoming increasingly important in the field of machine learning. In this thesis, we examine several traditional methods of dimensionality reduction, which include random projections, principal component analysis, singular value decomposition, kernel principal component analysis and discrete cosine transform. We also examine several existing applications of random projections (or dimensionality reduction, in general). In their paper, Random projections in dimensionality reduction: Applications to image and text data (2001), Bingham and Mannila suggest the use of random projections for query matching in a situation where a set of documents, instead of one particular one, were searched for. This suggests another application of random projections, namely to reduce the complexity of the query process. In this thesis, we explain why this approach fails, and suggest three alternative approaches to reducing the complexity of the query process using dimensionality reduction. We also outline query-based dimensionality reduction methods that can be used for image and web data. In each of the traditional approaches to dimensionality reduction (named above), each attribute in the reduced set is actually a linear combination of the attributes in the original data set. In this thesis, we take the position that true dimensionality reduction is obtained when the set of attributes in the reduced set is a proper subset of the attributes in the original data set, and we discuss seven novel approaches which satisfy this requirement. Using these seven approaches, as well as the RP and PCA approaches, we discuss several ways in which dimensionality reduction can be used for high dimensional clustering and classification.
- Published
- 2011
5. Enhanced Web Search Engines with Query-Concept Bipartite Graphs
- Author
- Chen, Yan
- Subjects
- Queries, Query-concept bipartite graph, Web search engine, Text mining, Computational intelligence, Computer Sciences
- Abstract
With the rapid growth of information on the Web, Web search engines have gained great momentum for exploiting valuable Web resources. Although keyword-based Web search engines provide relevant search results in response to users’ queries, further enhancement is still needed. Three important issues are: (1) search results can be diverse because ambiguous keywords in queries can be interpreted in different ways; (2) identifying keywords in long queries is difficult for search engines; and (3) generating query-specific Web page summaries is desirable for previews of Web search results. Based on clickthrough data, this thesis proposes a query-concept bipartite graph for representing relations among queries, and applies those relations to (1) personalized query suggestions, (2) Web searches with long queries, and (3) query-specific Web page summarization. Experimental results show that query-concept bipartite graphs improve performance in all three applications.
- Published
- 2010
6. Efficient evaluation of XML queries.
- Author
- Paparizos, Stylianos
- Subjects
- Databases, Efficient, Evaluation, Queries, Tree Logical Classes, Xml, Xqueries, Xquery
- Abstract
The contributions in this thesis focus on processing XML queries using an algebra and on exploiting optimization opportunities to enhance execution performance. Algebraic native XQuery implementation has focused on efficient evaluation of XPath expressions. However, little has been done on efficient evaluation of queries as a whole. As our first contribution, we introduce the Generalized Tree Pattern (GTP) as a concise representation of XQueries. Evaluating the query reduces to finding matches for its GTP. Using this idea we develop efficient evaluation plans that significantly outperform the competition. A bulk algebra requires manipulation of sets of objects that are structurally homogeneous; but this statement is in contrast with the nature of XML query processing. We address this problem with the introduction of Annotated Pattern Trees and Tree Logical Classes. We show that it is possible to define bulk operations on structurally heterogeneous sets of trees by inducing homogeneity through a tree logical class reduction. We define a Tree Logical Class (TLC) algebra and demonstrate its utility in evaluating XQuery. We show that TLC produces better performing evaluation plans than competing tree algebra techniques. XML and XQuery semantics are order sensitive. The order is determined upon elaborate explicit and implicit parameters. Determining the correct output order, as well as the order of each operator while optimizing the placement of SORTs is a non-trivial procedure that can significantly affect the query evaluation performance. Our solution uses Hybrid Collections annotated with Ordering and Duplicate Specifications. We show how we produce the correct output order while allowing for algebraic rewrites that enhance performance. XML is treated as schema-less to allow for evaluation techniques to be executed in the absence of schema. We assume schema knowledge exists and try to explore using it as a performance enhancing tool. For our last contribution, we show practical structures to store metadata knowledge, the Schema Information Graph (SIG) and the Alternate Paths, and algorithms that take advantage of them within the constraints of an optimizer. Optimized plans are shown to significantly outperform 'naive' ones.
- Published
- 2006
Discovery Service for Jio Institute Digital Library