Author: "Kyu-Young Whang" / Language: english - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Kyu-Young Whang"' showing total 43 results

Start Over Author "Kyu-Young Whang" Language english

43 results on '"Kyu-Young Whang"'

1. A top-down approach for density-based clustering using multidimensional indexes

Author: Jae-Joon Hwang, Kyu-Young Whang, Yang-Sae Moon, and Byung-Suk Lee
Subjects: Database, CD-ROM database, CD-ROM catalog, Server clustering, Data warehousing/data mining, Databases -- Analysis, Clustering (Computers) -- Analysis, Data mining -- Analysis
Published: 2004

2. Progressive Processing of Continuous Range Queries in Hierarchical Wireless Sensor Networks

Author: Byung Suk Lee, Kyu-Young Whang, Jun-Seok Heo, Jeong-Hoon Lee, and Hyo-Sang Lim
Subjects: FOS: Computer and information sciences, Ubiquitous computing, Wireless network, Computer science, Visual sensor network, Distributed computing, Real-time computing, Databases (cs.DB), Key distribution in wireless sensor networks, Computer Science - Databases, Artificial Intelligence, Hardware and Architecture, Mobile wireless sensor network, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Hierarchical network model, Wireless sensor network, Software, Energy (signal processing)
Abstract: In this paper, we study the problem of processing continuous range queries in a hierarchical wireless sensor network. Contrasted with the traditional approach of building networks in a "flat" structure using sensor devices of the same capability, the hierarchical approach deploys devices of higher capability in a higher tier, i.e., a tier closer to the server. While query processing in flat sensor networks has been widely studied, the study on query processing in hierarchical sensor networks has been inadequate. In wireless sensor networks, the main costs that should be considered are the energy for sending data and the storage for storing queries. There is a trade-off between these two costs. Based on this, we first propose a progressive processing method that effectively processes a large number of continuous range queries in hierarchical sensor networks. The proposed method uses the query merging technique proposed by Xiang et al. as the basis and additionally considers the trade-off between the two costs. More specifically, it works toward reducing the storage cost at lower-tier nodes by merging more queries, and toward reducing the energy cost at higher-tier nodes by merging fewer queries (thereby reducing "false alarms"). We then present how to build a hierarchical sensor network that is optimal with respect to the weighted sum of the two costs. It allows for a cost-based systematic control of the trade-off based on the relative importance between the storage and energy in a given network environment and application. Experimental results show that the proposed method achieves a near-optimal control between the storage and energy and reduces the cost by 0.989~84.995 times compared with the cost achieved using the flat (i.e., non-hierarchical) setup as in the work by Xiang et al., 41 pages, 20 figures
Published: 2009

3. An apples-to-apples comparison of two database journals

Author: Holger Meyer, M. Tamer Özsu, Kyu-Young Whang, Christian S. Jensen, Philip A. Bernstein, Elisa Bertino, Andreas Heuer, and Richard T. Snodgrass
Subjects: Very large database, Information retrieval, Database, Computer science, computer.software_genre, Database design, computer, Software, Information Systems
Abstract: This paper defines a collection of metrics on manuscript reviewing and presents historical data for ACM Transactions on Database Systems and The VLDB Journal.
Published: 2005

4. A Full-Coverage Two-Level URL Duplication Checking Method for a High-Speed Parallel Web Crawler.

Author: YOUNUS, ARJUMAND, KYU-YOUNG WHANG, HYUK-YOON KWON, and YEON-MI YEO
Subjects: UNIFORM Resource Locators, COST analysis, WEBSITES, COMPUTER network resources, APPROXIMATION theory
Abstract: For efficient large-scale Web crawlers, URL duplication checking is an important technique since it is a significant bottleneck. In this paper, we propose a new URL duplication checking technique for a parallel Web crawler; we call it full-coverage two level URL duplication checking (full-coverage-2L-UDC). Full-coverage-2L-UDC provides efficient URL duplication checking while ensuring maximum coverage. First, we propose two-level URL duplication checking (2L-UDC). It provides efficiency in URL duplication checking by communicating at the Web site level rather than at the Web page level. Second, we present a solution for the so-called coverage problem, which is directly related to the recall of the search engine. It is the first solution for the coverage problem in the centralized parallel architecture. Third, we propose an architecture, FC2-LUDCbot, for a centralized parallel crawler using fiill-coverage-2L-UDC. We build a seven-agent FC2L-UDCbot for extensive experiments. We show that the crawling speed of FC2L-UDCbot is approximately proportional to the number of agents (i.e., FC2L-UDC-bot is faster than a single-machine crawler by 6.9 times). Full-coverage-2L-UDC allows FC2L-UDCbot to be scalable to the number of agents since it effectively deals with the overheads incurred in a parallel environment. Through an in-depth analysis, we construct a cost model for estimating the crawling speed of a scaled-up crawler. Using the model, we show that FC2L-UDCbot can crawl Google-scale Web pages within several days using dozens of agents. [ABSTRACT FROM AUTHOR]
Published: 2015

5. The Hybrid-Layer Index: A synergic approach to answering top-k queries in arbitrary subspaces.

Author: Jun-Seok Heo, Junghoo Cho, and Kyu-Young Whang
Published: 2010
Full Text: View/download PDF

6. An incremental clustering crawler for community-limited search.

Author: Gye-Jeong Kim, Kyu-Young Whang, Min-Soo Kim, Hyo-Sang Lim, Ki-Hoon Lee, and Byung Suk Lee
Published: 2009
Full Text: View/download PDF

7. Type-Level Access Pattern View: A Technique for Enhancing Prefetching Performance.

Author: Mong Li Lee, Kian Lee Tan, Wuwongse, Vilas, Wook-Shin Han, Woong-Kee Loh, and Kyu-Young Whang
Abstract: Navigational applications on Object-Relational DBMSs (ORDBMSs) access objects in the database related to one another via reference and collection attributes. When accessing an object, the applications first look up the object cache in the client and, if the object does not exist, fetch the object from the server. Prefetching is to identify the objects that are highly probable to be accessed in the future by the applications and to save these objects in the object cache in advance. Since prefetching reduces the number of high cost fetches, it is crucial for the performance of the applications. The prefetch method proposed by Han et al.[7] reduces the number of fetches by orders of magnitude compared with the previous methods. However, overall performance enhancement is not as significant as reduction of fetches. Since the performance of prefetching is determined by the number of disk accesses in the server as well as the number of fetches. In this paper, we propose a technique for minimizing the number of disk accesses to enhance the performance of the prefetch method proposed by Han et al. We propose a method for creating materialized views based on the type-level path access logs proposed by Han et al.[6]. We call the materialized view as the type-level access pattern view. We then present an algorithm for minimizing the number of disk accesses when prefetching the objects from the database in the server by using the type-level access pattern view. We perform a series of experiments using a variety of databases to show that the technique proposed in this paper significantly enhances the overall performance of the navigational applications. We show that the proposed technique reduces the number of disk accesses by up to 33.0 times and enhances the performance by up to 21.4 times compared with the original prefetch method by Han et al. Keywords: navigational application, prefetch method, type-level path access log, type-level access pattern view. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

8. An Efficient Algorithm for Computing Range-Groupby Queries.

Author: Mong Li Lee, Kian Lee Tan, Wuwongse, Vilas, Young-Koo Lee, Woong-Kee Loh, Yang-Sae Moon, Kyu-Young Whang, and Il-Yeol Song
Abstract: Aggregation queries for arbitrary regions in an n-dimensional space are powerful tools for data analysis in OLAP. A GROUP BY query in OLAP is very important since it allows us to summarize various trends along with any combination of dimensions. In this paper, we extend the previous aggregation queries by including the GROUP BY clause for arbitrary regions. We call the extension range-groupby queries and present an efficient algorithm for processing them. A typical method of achieving fast response time for aggregation queries is using the prefix-sum array, which stores precomputed partial aggregation values. A naive method for range-groupby queries maintains a prefix-sum array for each combination of the grouping dimensions in an n-dimensional cube, which incurs enormous storage overhead. Our algorithm maintains only one prefix-sum array and still effectively processes range-groupby queries for all possible combinations of multiple grouping dimensions. Compared with the naive method, our algorithm reduces the space overhead by , while accessing almost the identical number of cells. Keywords: range-groupby queries, aggregation queries, prefix-sum arrays, data cubes. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

9. Efficient Evaluation of Partial Match Queries for XML Documents Using Information Retrieval Techniques.

Author: Lizhu Zhou, Beng Chin Ooi, Xiaofeng Meng, Young-Ho Park, Kyu-Young Whang, Byung Suk Lee, and Wook-Shin Han
Abstract: We propose XIR, a novel method for processing partial match queries on heterogeneous XML documents using information retrieval (IR) techniques. A partial match query is defined as the one having the descendent-or-self axis "//" in its path expression. In its general form, a partial match query has branch predicates forming branching paths. The objective of XIR is to efficiently support this type of queries for large-scale documents of heterogeneous schemas. XIR has its basis on the conventional schema-level methods using relational tables and significantly improves their efficiency using two techniques: an inverted index technique and a novel prefix match join. The former indexes the labels in label paths as keywords in texts, and allows for finding the label paths matching the queries more efficiently than string match used in the conventional methods. The latter supports branching path expressions, and allows for finding the result nodes more efficiently than containment joins used in the conventional methods. We compare the efficiency of XIR with those of XRel and XParent using XML documents crawled from the Internet. The results show that XIR is more efficient than both XRel and XParent by several orders of magnitude for linear path expressions, and by several factors for branching path expressions. [ABSTRACT FROM AUTHOR]
Published: 2005
Full Text: View/download PDF

10. An update-risk based approach to TTL estimation in Web caching.

Author: Jeong-Joon Lee, Kyu-Young Whang, Byung Suk Lee, and Ji-Woong Chang
Published: 2002
Full Text: View/download PDF

11. Prefetching based on the type-level access pattern in object-relational DBMSs.

Author: Wook-Shin Han, Kyu-Young Whang, Yang-Sae Moon, and Il-Yeol Song
Published: 2001
Full Text: View/download PDF

12. Duality-based subsequence matching in time-series databases.

Author: Yang-Sae Moon, Kyu-Young Whang, and Woong-Kee Loh
Published: 2001
Full Text: View/download PDF

13. A cost-based replacement algorithm for object buffers.

Author: Chong-Mok Park, Kyu-Young Whang, Jeong-Joon Lee, and Il-Yeol Song
Published: 2000
Full Text: View/download PDF

14. 2D-CHI: a tunable two-dimensional class hierarchy index for object-oriented databases.

Author: Jong-Hak Lee, Kyu-Young Whang, Wook-Shin Han, Wan-Sup Cho, and Il-Yeol Song
Published: 2000
Full Text: View/download PDF

15. Database Systems for Advanced Applications : 9th International Conference, DASFAA 2004, Jeju Island, Korea, March 17-19, 2003, Proceedings

Author: YoonJoon Lee, Jianzhong Li, Kyu-Young Whang, Doheon Lee, YoonJoon Lee, Jianzhong Li, Kyu-Young Whang, and Doheon Lee
Subjects: Database management--Congresses
Published: 2004

16. The clustering property of corner transformation for spatial database applications.

Author: Ju-Won Song, Kyu-Young Whang, Young-Koo Lee, and Sang-Wook Kim
Published: 1999
Full Text: View/download PDF

17. Query Optimization Techniques Utilizing Path Indexes in Object-Oriented Database Systems.

Author: Wan-Sup Cho, Kyu-Young Whang, Seung-Sun Lee, and Yong-Ik Yoon
Subjects: QUERY (Information retrieval system), MATHEMATICAL optimization, DATABASES, SEARCH algorithms, GRAPH theory
Published: 1997

18. On analyzing the errors in a selectivity estimation method using a multidimensional file structure.

Author: Sang-Wook Kim, Whan-Kyu Whang, and Kyu-Young Whang
Published: 1998
Full Text: View/download PDF

19. A join algorithm utilizing multiple path indexes in object-oriented database systems.

Author: Wan-Sup Cho, Seung-Sun Lee, Yong-Ik Yoon, and Kyu-Young Whang
Published: 1996
Full Text: View/download PDF

20. Providing orthogonal persistence to C++ using forced inheritance.

Author: Chong-Mok Park, Kyu-Young Whang, Il-Yeol Song, and Navathe, S.
Published: 1994
Full Text: View/download PDF

21. XMin: Minimizing Tree Pattern Queries with Minimality Guarantee.

Author: Ki-Hoon Lee, Kyu-Young Whang, and Wook-Shin Han
Subjects: *XML (Extensible Markup Language), *QUERY languages (Computer science), *QUERY (Information retrieval system), *DOCUMENT markup languages, *PROGRAMMING languages
Abstract: Due to wide use of XPath, the problem of efficiently processing XPath queries has recently received a lot of attention. In particular, a considerable effort has been devoted to minimizing XPath queries since the efficiency of query processing greatly depends on the size of the query. Research work in this area can be classified into two categories: constraint-independent minimization and constraint-dependent minimization. The former minimizes queries in the absence of integrity constraints while the latter in the presence of them. For a linear path query, which is an XPath query without branching predicates, existing constraint-independent minimization methods are generally known to be unable to minimize the query without processing the query itself. Most recently, however, by using the DataGuide, a representative structural summary of XML data, a constraint-independent method that minimizes linear path queries in a top-down fashion has been proposed. Nevertheless, this method can fail to find a minimal query since it minimizes a query by merely erasing labels from the original query whereas a minimal query could include labels that are not present in the original query. In this paper, we propose a bottom-up approach called XMin that guarantees finding a minimal query for a given tree pattern query by using the DataGuide without processing the query itself. For the linear path query, we first show that the sequence of labels occurring in the minimal query is a subsequence of every schema label sequence that matches the original query. Here, the schema label sequence for a node is the sequence of labels from the root of XML data to the node. We then propose iterative subsequence generation that iteratively generates subsequences from the shortest schema label sequence matching the original query in a bottom-up fashion and tests query equivalence. Using iterative subsequence generation, we can always find a minimal query and we formally prove this guarantee. We also propose an extended algorithm that guarantees the minimality for the tree pattern query, which is a linear path query with branching predicates. These methods have been prototyped in a full-fledged object-relational DBMS. The experimental results using real and synthetic data sets show the practicality of our method. [ABSTRACT FROM AUTHOR]
Published: 2010
Full Text: View/download PDF

22. Structural consistency: enabling XML keyword search to eliminate spurious results consistently.

Author: Ki-Hoon Lee, Kyu-Young Whang, Wook-Shin Han, and Min-Soo Kim
Abstract: XML keyword search is a user-friendly way to query XML data using only keywords. In XML keyword search, to achieve high precision without sacrificing recall, it is important to remove spurious results not intended by the user. Efforts to eliminate spurious results have enjoyed some success using the concepts of LCA or its variants, SLCA and MLCA. However, existing methods still could find many spurious results. The fundamental cause for the occurrence of spurious results is that the existing methods try to eliminate spurious results locally without global examination of all the query results and, accordingly, some spurious results are not consistently eliminated. In this paper, we propose a novel keyword search method that removes spurious results consistently by exploiting the new concept of structural consistency. We define structural consistency as a property that is preserved if there is no query result having an ancestor-descendant relationship at the schema level with any other query results. A naive solution to obtain structural consistency would be to compute all the LCAs (or variants) and then to remove spurious results according to structural consistency. Obviously, this approach would always be slower than existing LCA-based ones. To speed up structural consistency checking, we must be able to examine the query results at the schema level without generating all the LCAs. However, this is a challenging problem since the schema-level query results do not homomorphically map to the instance-level query results, causing serious false dismissal. We present a comprehensive and practical solution to this problem and formally prove that this solution preserves structural consistency at the schema level without incurring false dismissal. We also propose a relevance-feedback-based solution for the problem where our method has low recall, which occurs when it is not the user’s intention to find more specific results. This solution has been prototyped in a full-fledged object-relational DBMS Odysseus developed at KAIST. Experimental results using real and synthetic data sets show that, compared with the state-of-the-art methods, our solution significantly (1) improves precision while providing comparable recall for most queries and (2) enhances the query performance by removing spurious results early. [ABSTRACT FROM AUTHOR]
Published: 2010
Full Text: View/download PDF

23. Structural optimization of a full-text n -gram index using relational normalization.

Author: Min-Soo Kim, Kyu-Young Whang, Jae-Gil Lee, and Min-Jae Lee
Abstract: Abstract As the amount of text data grows explosively, an efficient index structure for large text databases becomes ever important. The n-gram inverted index (simply, the n-gram index) has been widely used in information retrieval or in approximate string matching due to its two major advantages: language-neutral and error-tolerant. Nevertheless, the n-gram index also has drawbacks: the size tends to be very large, and the performance of queries tends to be bad. In this paper, we propose the two-level n-gram inverted index (simply, the n-gram/2L index) that significantly reduces the size and improves the query performance by using the relational normalization theory. We first identify that, in the (full-text) n-gram index, there exists redundancy in the position information caused by a non-trivial multivalued dependency. The proposed index eliminates such redundancy by constructing the index in two levels: the front-end index and the back-end index. We formally prove that this two-level construction is identical to the relational normalization process. We call this process structural optimization of the n-gram index. The n-gram/2L index has excellent properties: (1) it significantly reduces the size and improves the performance compared with the n-gram index with these improvements becoming more marked as the database size gets larger; (2) the query processing time increases only very slightly as the query length gets longer. Experimental results using real databases of 1 GB show that the size of the n-gram/2L index is reduced by up to 1.9–2.4 times and, at the same time, the query performance is improved by up to 13.1 times compared with those of the n-gram index. We also compare the n-gram/2L index with Makinen’s compact suffix array (CSA) (Proc. 11th Annual Symposium on Combinatorial Pattern Matching pp. 305–319, 2000) stored in disk. Experimental results show that the n-gram/2L index outperforms the CSA when the query length is short (i.e., less than 15–20), and the CSA is similar to or better than the n-gram/2L index when the query length is long (i.e., more than 15–20). [ABSTRACT FROM AUTHOR]
Published: 2008
Full Text: View/download PDF

24. Advances in Knowledge Discovery and Data Mining : 7th Pacific-Asia Conference, PAKDD 2003. Seoul, Korea, April 30 - May 2, 2003, Proceedings

Author: Kyu-Young Whang, Jongwoo Jeon, Kyuseok Shim, Jaideep Srivatava, Kyu-Young Whang, Jongwoo Jeon, Kyuseok Shim, and Jaideep Srivatava
Subjects: Database management--Congresses, Database searching--Congresses, Data mining--Congresses
Abstract: The 7th Paci?c Asia Conference on Knowledge Discovery and Data Mining (PAKDD) was held from April 30 to May 2, 2003 in the Convention and Ex- bition Center (COEX), Seoul, Korea. The PAKDD conference is a major forum for academic researchers and industry practitioners in the Paci?c Asia region to share original research results and development experiences from di?erent KDD-related areas such as data mining, data warehousing, machine learning, databases, statistics, knowledge acquisition and discovery, data visualization, and knowledge-based systems. The conference was organized by the Advanced Information Technology Research Center (AITrc) at KAIST and the Statistical Research Center for Complex Systems (SRCCS) at Seoul National University. It was sponsored by the Korean Datamining Society (KDMS), the Korea Inf- mation Science Society (KISS), the United States Air Force O?ce of Scienti?c Research, the Asian O?ce of Aerospace Research & Development, and KAIST. It was held with cooperation from ACM's Special Group on Knowledge Dis- very and Data Mining (SIGKDD).
Published: 2003

25. n-Gram/2L-approximation: a two-level n-gram inverted index structure for approximate string matching.

Author: Min-Soo Kim, Kyu-Young Whang, and Jae-Gil Lee
Subjects: APPROXIMATION theory, ALGORITHMS, INVERTED indexes, DATABASES, INDEXES, FUNCTIONAL analysis
Abstract: Approximate string matching is to find all the occurrences of a query string in a text database allowing a specified number of errors. Approximate string matching based on the n-gram inverted index (simply, n-gram Matching) has been widely used. A major reason is that it is scalable for large databases since it is not a main memory algorithm. Nevertheless, n-gram Matching also has drawbacks: the query performance tends to be bad, and many false positives occur if a large number of errors are allowed. In this paper, we propose an inverted index structure, which we call the n-gram/2L-Approximation index, that improves these drawbacks and an approximate string matching algorithm based on it. The n-gram/2L-Approximation is an adaptation of the n-gram/2L index [4], which the authors have proposed earlier for exact matching. Inheriting the advantages of the n-gram/2L index, the n-gram/2L-Approximation index reduces the size of the index and improves the query performance compared with the n-gram inverted index. In addition, the n-gram/2L-Approximation index reduces false positives compared with the n-gram inverted index if a large number of errors are allowed. We perform extensive experiments using the text and protein databases. Experimental results using databases of 1 GBytes show that the n-gram/2L-Approximation index reduces the index size by up to 1.8 times and, at the same time, improves the query performance by up to 4.2 times compared with those of the n-gram inverted index. [ABSTRACT FROM AUTHOR]
Published: 2007

26. n-Gram/2 L-approximation: a two-level n-gram inverted index structure for approximate string matching.

Author: Min-Soo Kim, Kyu-Young Whang, and Jae-Gil Lee
Subjects: DATABASES, ALGORITHMS, ERRORS, INVERTED indexes, APPROXIMATION theory, QUERY languages (Computer science)
Abstract: Approximate string matching is to find all the occurrences of a query string in a text database allowing a specified number of errors. Approximate string matching based on the n-gram inverted index (simply, n-gram Matching) has been widely used. A major reason is that it is scalable for large databases since it is not a main memory algorithm. Nevertheless, n-gram Matching also has drawbacks: the query performance tends to be bad, and many false positives occur if a large number of errors are allowed. In this paper, we propose an inverted index structure, which we call the n-gram/2L-Approximation index, that improves these drawbacks and an approximate string matching algorithm based on it. The n-gram/2L-Approximation is an adaptation of the n-gram/2L index [4], which the authors have proposed earlier for exact matching. Inheriting the advantages of the n-gram/2L index, the n-gram/2L-Approximation index reduces the size of the index and improves the query performance compared with the n-gram inverted index. In addition, the n-gram/2L-Approximation index reduces false positives compared with the n-gram inverted index if a large number of errors are allowed. We perform extensive experiments using the text and protein databases. Experimental results using databases of 1 GBytes show that the n-gram/2L-Approximation index reduces the index size by up to 1.8 times and, at the same time, improves the query performance by up to 4.2 times compared with those of the n-gram inverted index. [ABSTRACT FROM AUTHOR]
Published: 2007

27. Temporal Aggregation Using a Multidimensional Index.

Author: Joon-Ho Woo, Min-Jae Lee, Woong-Kee Loh, and Kyu-Young Whang
Subjects: MULTIDIMENSIONAL databases, DATABASES, INDEXES, DOCUMENTATION, BIBLIOGRAPHY, CONCORDANCES, INFORMATION science, COMPUTER science, SCIENCE, MATHEMATICAL mappings
Abstract: We present a new method for computing temporal aggregation that uses a multidimensional index. The novelty of our method lies in mapping the start time and end time of a temporal tuple to a data point in a two-dimensional space, which is stored in a two-dimensional index, and in calculating the temporal aggregates through a temporal join between the data in the index and the base intervals (defined as the intervals delimited by the start times or end times of the tuples). To enhance the performance, this method calculates the aggregates by incrementally modifying the aggregates from that of the previous base interval without re-reading all tuples for the current base interval. We have compared our method with the SB-tree, which is the state-of-the-art method for temporal aggregation. The results show that our method is an order of magnitude more efficient than the SB-tree method in an environment with frequent updates, while comparable in a read-only environment as the number of aggregates calculated in a query increases. [ABSTRACT FROM AUTHOR]
Published: 2007
Full Text: View/download PDF

28. Hippocratic XML databases: a model and an access control mechanism.

Author: Jae-Gil Lee, Kyu-Young Whang, Wook-Shin Han, and Il-Yeol Song
Subjects: XML (Extensible Markup Language), DATABASES, DOCUMENT markup languages, INTERNET programming, INFORMATION storage & retrieval systems
Abstract: The Hippocratic database model recently proposed by Agrawal et al. incorporates privacy protection capabilities into relational databases. Since the Hippocratic database is based on the relational database, it needs extensions to be adapted for XML databases. In this paper, we propose the Hippocratic XML database model, an extension of the Hippocratic database model for XML databases and present an efficient access control mechanism under this model. In contrast to relational data, XML data have tree-like hierarchies. Thus, in order to manage these hierarchies of XML data, we extend and formally define concepts presented in the Hippocratic database model. Next, we present a new mechanism, which we call the authorization index, that is used in the access control mechanism. This authorization index, which is implemented using a multi-dimensional index, allows us to efficiently search authorizations implied by the authorization granted on the nearest ancestor using the nearest neighbor search technique. Using synthetic and real data, we have performed extensive experiments comparing query processing lime with those of existing access control mechanisms. The results show that the proposed access control mechanism improves the wall clock time by up to 14 times over the top-down access control strategy and by up to 20 times over the bottom-up access control strategy. The major contributions of our paper are (1) extending the Hippocratic database model into the Hippocratic XML database model and (2) proposing an efficient access control mechanism that uses the authorization index and nearest neighbor search technique under this model. [ABSTRACT FROM AUTHOR]
Published: 2006

29. Transform-Space View: Performing Spatial Join in the Transform Space Using Original-Space Indexes.

Author: Min-Jae Lee, Kyu-Young Whang, Wook-Shin Han, and Il-Yeoi Song
Subjects: *SPATIAL analysis (Statistics), *MATHEMATICAL statistics, *ALGORITHMS, *MATHEMATICAL optimization, *MATHEMATICAL analysis, *SPATIAL systems, *STATISTICAL measurement, *ONLINE data processing, *STATISTICS
Abstract: Spatial joins find all pairs of objects that satisfy a given spatial relationship. In spatial joins using indexes, original-space indexes such as the R-tree are widely used. An original-space index is the one that indexes objects as represented in the original space. Since original-space indexes deal with extents of objects, it is relatively complex to optimize join algorithms using these indexes. On the other hand, transform-space indexes, which transform objects in the original space into points in the transform space and index them, deal only with points but no extents. Thus, optimization of join algorithms using these indexes can be relatively simple. However, the disadvantage of these join algorithms is that they cannot be applied to original-space indexes such as the A-tree. In this paper, we present a novel mechanism for achieving the best of these two types of algorithms. Specifically, we propose the new notion of the transform-space view and present the transform-space viewjoin algorithm. The transform-space view is a virtual transform-space index based on an original-space index. It allows us to "interpret" or "view" an existing original-space index as a transform-space index with no space and negligible time overhead and without actually modifying the structure of the original-space index or changing object representation. The transform-space view join algorithm joins two original-space indexes in the transform space through the notion of the transform-space view. Through analysis and experiments, we verity the excellence of the transform-space view join algorithm. The transform-space view join algorithm always outperforms existing ones for all the data sets tested in terms of all three measures used: the one-pass buffer size (the minimum buffer size required for guaranteeing one disk access per page), the number of disk accesses for a given buffer size, and the wall clock time. Thus, it constitutes a lower-bound algorithm. We believe that the proposed transform-space view can be applied to developing various new spatial query processing algorithms in the transform space. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

30. A Formal Framework for Prefetching Based on the Type-Level Access Pattern in Object-Relational DBMSs.

Author: Wook-Shin Han, Kyu-Young Whang, and Yang-Sae Moon
Subjects: *DATABASE management, *ELECTRONIC data processing, *RELATIONAL databases, *INFORMATION storage & retrieval systems, *XML (Extensible Markup Language), *COMPUTER science
Abstract: Prefetching is an effective method for minimizing the number of fetches between the client and the server in a database management system. In this paper, we formally define the notion of prefetching. We also formally propose new notions of the type-level access locality and type-level access pattern. The type-level access locality is a pheonomenon that repetitive patterns exist in the attributes referenced. The type-level access pattern is a pattern of attributes that are referenced in accessing the objects. We then develop an efficient capturing and prefetching policy based on this formal framework. Existing prefetching methods are based on object-level or page-level access patterns, which consist of object-ids or page-ids of the objects accessed. However, the drawback of these methods is that they work only when exactly the same objects or pages are accessed repeatedly. In contrast, even though the same objects are not accessed repeatedly, our technique effectively prefetches objects if the same attributes are referenced repeatedly, i.e., if there is type-level access locality. Many navigational applications in Object-Relational Database Management Systems (ORDBMSs) have type-level access locality. Therefore, our technique can be employed in ORDBMSs to effectively reduce the number of fetches, thereby significantly enhancing the performance. We also address issues in implementing the proposed algorithm. We have conducted extensive experiments in a prototype ORDBMS to show effectiveness of our algorithm. Experimental results using the OO7 benchmark, a real GIS application, and an XML application show that our technique reduces the number of fetches by orders of magnitude and improves the elapsed time by several factors over on-demand fetching and context-based prefetching, which is a state-of-the-art prefetching method. These results indicate that our approach provides a new paradigm in prefetching that improves performance of navigational applications significantly and is a practical method that can be implemented in commercial ORDBMSs. [ABSTRACT FROM AUTHOR]
Published: 2005
Full Text: View/download PDF

31. Transformation-based spatial partition join.

Author: Min-Jae Lee, Wook-Shin Han, and Kyu-Young Whang
Subjects: SPATIAL analysis (Statistics), ALGORITHMS, SPATIAL systems, SYSTEM analysis, COMPUTER systems
Abstract: Spatial joins find all pairs of spatial objects that satisfy a given spatial relationship. In this paper, we present the Transformation-Based Spatial Partition Join algorithm (TSPJ), a new spatial join algorithm that performs join in the transform space without using indexes. Since the existing algorithms deal with extents of spatial objects in the original space, they either need to replicate the spatial objects or have a relatively complex partition structure -- resulting in degrading performance. In contrast, the Transformation-Based Spatial Partition Join transforms objects in the original space into points in the transform space and deals only with points having no extents. The transformation does not incur any additional overhead. Thus, our algorithm has advantages over existing ones in that (1) it obviates the need for replicating spatial objects, and (2) its partition structure is simple. As a result, it always has better performance compared to existing algorithms. Extensive experiments show that the Transformation-Based Spatial Partition. Join improves performance by 19.4∼38.0% over the existing algorithms compared. [ABSTRACT FROM AUTHOR]
Published: 2004

32. Dynamic Buffer Allocation in Video-on-Demand Systems.

Author: Sang-Ho Lee, Kyu-Young Whang, Yang-Sae, Jayakrishna R., Wook-Shin Han, and II-Yeol Song, Jayakrishna R.
Subjects: *VIDEO on demand, *BUFFER storage (Computer science), *MULTIMEDIA systems, *COMPUTER memory management, *COMPUTER systems
Abstract: In video-on-demand (VOD) systems, as the size of the buffer allocated to user requests increases, initial latency and memory requirements increase. Hence, the buffer size must be minimized. The existing static buffer allocation scheme, however, determines the buffer size based on the assumption that the system is in the fully loaded state. Thus, when the system is in a partially loaded state, the scheme allocates a buffer larger than necessary to a user request. This paper proposes a dynamic buffer allocation scheme that allocates to user requests buffers of the minimum size in a partially loaded state, as well as in the fully loaded state. The inherent difficulty in determining the buffer size in the dynamic buffer allocation scheme is that the size of the buffer currently being allocated is dependent on the number of and the sizes of the buffers to be allocated in the next service period. We solve this problem by the predict-and-enforce strategy, where we predict the number and the sizes of future buffers based on inertia assumptions and enforce these assumptions at runtime. Any violation of these assumptions is resolved by deferring service to the violating new user request until the assumptions are satisfied. Since the size of the current buffer is dependent on the sizes of the future buffers, it is represented by a recurrence equation. We provide a solution to this equation, which can be computed at the system initialization time for runtime efficiency. We have performed extensive analysis and simulation. The results show that the dynamic buffer allocation scheme reduces initial latency (averaged over the number of user requests in service from one to the maximum capacity) to 1&frac;29.4 ∼ 1&frac;11.0 of that for the static one and, by reducing the memory requirement, increases the number of concurrent user requests to 2.36 ∼ 3.25 times that of the static one when averaged over the amount of system memory available. These results demonstrate that the dynamic buffer allocation scheme significantly improves the performance and capacity of VOD systems. [ABSTRACT FROM AUTHOR]
Published: 2003
Full Text: View/download PDF

33. Spatial join processing using corner transformation.

Author: Ju-Won Song and Kyu-Young Whang
Subjects: *DATABASE management, *ELECTRONIC data processing, *COMPUTER algorithms
Abstract: Proposes a spatial join algorithm using corner transformation in database systems. Relationships among spatial join windows; Evaluation of the viability of the algorithm; Comparison with R*-tree algorithm; Advantages of the proposed algorithm.
Published: 1999
Full Text: View/download PDF

34. Two-Dimensional Specification of Universal Quantification in a Graphical Database Query Language.

Author: Kyu-Young Whang, Maihotra, Ashok, Sockut, Gary H., Burns, Luanne, and Key-Sun Choi
Subjects: *DATABASES, *DATABASE management, *DATABASE searching, *INFORMATION storage & retrieval systems, *QUERY (Information retrieval system), *QUERY languages (Computer science)
Abstract: We propose a technique for specifying universal quantification and existential quantification (combined with negation) in a two-dimensional (graphical) database query language. Unlike other approaches that provide set operators to simulate universal quantification, this technique allows a direct representation of universal quantification. We present syntactic constructs for specifying universal and existential quantifications, two-dimensional translation of universal quantification to existential quantification (with negation), and translation of existentially quantified two-dimensional queries to relational queries. The resulting relational queries can be processed directly by many existing database systems. Traditionally, universal quantification has been considered a difficult concept for typical database programmers. We claim that this technique renders universal quantification easy to understand. To substantiate this claim, we provide a simple, easy-to-follow guideline for constructing universally quantified queries. We believe that the direct representation of universal quantification in a two-dimensional language is new and that our technique contributes significantly to the understanding of universal quantification in the context of database query languages. [ABSTRACT FROM AUTHOR]
Published: 1992
Full Text: View/download PDF

35. Separability —An Approach to Physical Database Design.

Author: Kyu-Young Whang, Wiederhold, and Sagalowicz
Published: 1984
Full Text: View/download PDF

36. Octree-R: an adaptive octree for efficient ray tracing.

Author: Kyu-Young Whang, Ju-Won Song, Ji-Woong Chang, Ji-Yun Kim, Wan-Sup Cho, Chong-Mok Park, and Il-Yeol Song
Published: 1995
Full Text: View/download PDF

37. DyBASe: A buffer allocation scheme for reducing average initial latency in video-on-demand systems.

Author: Sang-Ho Lee, Kyu-Young Whang, Yang-Sae Moon, and Il-Yeol Song
Subjects: *VIDEO on demand, *INTERACTIVE videos
Abstract: Presents a study which proposed a buffer allocation scheme for reducing average initial latency in video-on-demand (VOD) systems. Model of VOD systems; Static buffer allocation scheme; Conclusions.
Published: 2001
Full Text: View/download PDF

38. Tightly-coupled spatial database features in the Odysseus/OpenGIS DBMS for high-performance

Author: Min-Jae Lee, Ki-Hoon Lee, Kyu-Young Whang, Min-Soo Kim, Wook-Shin Han, Jae-Gil Lee, and Jun-Sung Kim
Subjects: Data processing, Geographic information system, Database, Interface (Java), Computer science, business.industry, Spatial database, Geography, Planning and Development, computer.software_genre, Data type, Concurrency control, Information system, business, computer, Spatial analysis, Information Systems
Abstract: Conventional object-relational database management system (ORDBMS) vendors provide extension mechanisms for adding user-defined types and functions to their own DBMSs. Here, the extension mechanisms are implemented using a high-level (typically, SQL-level) interface. We call this mechanism loose-coupling. The advantage of loose-coupling is that it is easy to implement. However, it is not preferable for implementing new data types and operations in large databases when high performance is required. We have earlier proposed the tight-coupling architecture (Whang et al. 2002, 2005) to satisfy this requirement. In tight-coupling, new data types and operations are integrated into the core of the DBMS engine in the extensible type layer. Thus, they are supported in a consistent manner with high performance. This tight-coupling architecture is being used to incorporate information retrieval features and spatial database features into the Odysseus ORDBMS that has been under development at KAIST/AITrc for 19 years. In this paper, we introduce the tightly-coupled spatial database features of Odysseus/OpenGIS. By taking advantage of tight-coupling, Odysseus/OpenGIS provides excellent performance in processing spatial queries as well as flexible concurrency control and recovery on spatial data. We show the performance through extensive experiments. Finally, we present sample applications of a geographical information system (GIS) implemented using Odysseus/OpenGIS.
Full Text: View/download PDF

39. Generalization of ZYT-linearizability for bilinear datalog programs

Author: Ki-Hyung Hong, Ji-Hoon Kang, Kyu-Young Whang, and Jung-Wan Cho
Subjects: Discrete mathematics, Linearizability, Generalization, Deductive database, Bilinear interpolation, Bilinear form, Type (model theory), Computer Science Applications, Theoretical Computer Science, Datalog, Automated theorem proving, Computational Theory and Mathematics, Deductive databases, Bilinear rules, Datalog programs, computer, Information Systems, Mathematics, computer.programming_language
Abstract: We propose a new type of linearizability, called right-linear-first (RLF) linearizability. The well-known ZYT-linearizability deals with only one bilinear rule. RLF-linearizability is a generalization of ZYT-linearizability since RLF-linearizability deals with general bilinear datalog programs consisting of multiple bilinear and linear rules. We identify sufficient conditions for RLF-linearizability. The test of the sufficient conditions is exponential in the size of the input datalog program, which is, however, usually very small compared with the size of the extensional database in deductive database applications.
Full Text: View/download PDF