5 results on '"Yu, Jeffrey X."'
Search Results
2. Revisit on View Maintenance in Data Warehouses.
- Author
- Liang, Weifa and Yu, Jeffrey X.
- Abstract
The complete-consistency maintenance of SPJ-type materialized views in a distributed source environment has been studied extensively in the past several years due to its fundamental importance to data warehouses. Much of this effort has assumed that each source site contains only one relation and that no relation appears more than once in the definition of a view. In this paper we consider a generalized version of the view maintenance problem in which a relation may appear many times in the definition of the view and a site may contain multiple relations. Due to the unpredictability of the communication delay and bandwidth between the data warehouse and the sources, materialized view maintenance is very expensive and time consuming. Therefore, one natural question for this generalized case is whether there is an algorithm which not only keeps the view completely consistent with the remote source data but also minimizes the number of accesses to the remote sites. In this paper we first show that a known SWEEP algorithm is one of the best algorithms for the case where multiple relations are included in a site. We then propose a complete consistency algorithm which accesses remote sources fewer than n-1 times for the case where multiple appearances of a relation are allowed, where n is the number of relations in the definition of the view. [ABSTRACT FROM AUTHOR]
- Published
- 2001
3. A Break for Workaholics: Energy-Efficient Selective Tuning Mechanisms for Demand-Driven-Based Wireless Environment.
- Author
- Kian-Lee Tan and Yu, Jeffrey X.
- Subjects
- WORKAHOLICS, ENERGY consumption, FREQUENCY tuning, WIRELESS communications, ARTIFICIAL intelligence
- Published
- 1997
4. Text Classification without Negative Examples Revisit.
- Author
- Gabriel Pui Cheong Fung, Yu, Jeffrey X., Hongjun Lu, and Yu, Philip S.
- Subjects
- OPERATIONS research, KNOWLEDGE management, METHODOLOGY, PROBABILITY theory, BENCHMARKING (Management), DATA mining
- Abstract
Traditionally, building a classifier requires two sets of examples: positive examples and negative examples. This paper studies the problem of building a text classifier using positive examples (P) and unlabeled examples (U). The unlabeled examples are mixed with both positive and negative examples. Since no negative example is given explicitly, the task of building a reliable text classifier becomes far more challenging. Simply treating all of the unlabeled examples as negative examples and building a classifier thereafter is undoubtedly a poor approach to tackling this problem. Generally speaking, most of the studies solved this problem by a two-step heuristic: First, extract negative examples (N) from U. Second, build a classifier based on P and N. Surprisingly, most studies did not try to extract positive examples from U. Intuitively, enlarging P by P' (positive examples extracted from U) and building a classifier thereafter should enhance the effectiveness of the classifier. Throughout our study, we find that extracting P' is very difficult. A document in U that possesses the features exhibited in P does not necessarily mean that it is a positive example, and vice versa. The very large size of and very high diversity in U also contribute to the difficulties of extracting P'. In this paper, we propose a labeling heuristic called PNLH to tackle this problem. PNLH aims at extracting high quality positive examples and negative examples from U and can be used on top of any existing classifiers. Extensive experiments based on several benchmarks are conducted. The results indicated that PNLH is highly feasible, especially in the situation where ∣P∣ is extremely small. [ABSTRACT FROM AUTHOR]
- Published
- 2006
5. Optimizing multiple dimensional queries simultaneously in multidimensional databases.
- Author
- Liang, Weifa, Orlowska, Maria E., and Yu, Jeffrey X.
- Abstract
Some significant progress related to multidimensional data analysis has been achieved in the past few years, including the design of fast algorithms for computing datacubes, selecting some precomputed group-bys to materialize, and designing efficient storage structures for multidimensional data. However, little work has been carried out on multidimensional query optimization issues. Particularly the response time (or evaluation cost) for answering several related dimensional queries simultaneously is crucial to the OLAP applications. Recently, Zhao et al. first exploited this problem by presenting three heuristic algorithms. In this paper we first consider in detail two cases of the problem in which all the queries are either hash-based star joins or index-based star joins only. In the case of the hash-based star join, we devise a polynomial approximation algorithm which delivers a plan whose evaluation cost is $O(n^{\epsilon})$ times the optimal, where n is the number of queries and $\epsilon$ is a fixed constant with $0 < \epsilon \leq 1$. We also present an exponential algorithm which delivers a plan with the optimal evaluation cost. In the case of the index-based star join, we present a heuristic algorithm which delivers a plan whose evaluation cost is n times the optimal, and an exponential algorithm which delivers a plan with the optimal evaluation cost. We then consider a general case in which both hash-based star-join and index-based star-join queries are included. For this case, we give a possible improvement on the work of Zhao et al., based on an analysis of their solutions. We also develop another heuristic and an exact algorithm for the problem. We finally conduct a performance study by implementing our algorithms. The experimental results demonstrate that the solutions delivered for the restricted cases are always within two times of the optimal, which confirms our theoretical upper bounds. Actually these experiments produce much better results than our theoretical estimates. To the best of our knowledge, this is the only development of polynomial algorithms for the first two cases which are able to deliver plans with deterministic performance guarantees in terms of the qualities of the plans generated. The previous approaches including that of [ZDNS98] may generate a feasible plan for the problem in these two cases, but they do not provide any performance guarantee, i.e., the plans generated by their algorithms can be arbitrarily far from the optimal one. [ABSTRACT FROM AUTHOR]
- Published
- 2000
Discovery Service for Jio Institute Digital Library