4,806 results on '"Query processing"'
Search Results
2. A Comprehensive Energy Modeling Approach for Query Processing: Steps and Machine Learning Influence
- Author
-
Dembele, Simon Pierre, De Simone, Marco Claudio, Lorusso, Angelo, Santaniello, Domenico, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Yang, Xin-She, editor, Sherratt, R. Simon, editor, Dey, Nilanjan, editor, and Joshi, Amit, editor
- Published
- 2025
- Full Text
- View/download PDF
3. An Intelligent Optimized Compression Framework for Columnar Database.
- Author
-
Jadhawar, B. A. and Sharma, Narendra
- Subjects
- *DATABASE management, *OLAP technology, *DATABASES, *ONLINE data processing
- Abstract
A columnar database is a type of Database Management System (DBMS) that stores data in columns rather than rows. Its job is to write and read data to and from hard disk storage efficiently so that queries can be answered quickly. Compression is one of the most crucial techniques in building column-oriented database systems. For columns of Zero-length string type, however, both heavyweight and lightweight compression techniques have limitations. These databases are also substantially more effective for online analytical processing than for online transactional processing: although they are designed to analyze transactions, they are not very effective at updating them. To overcome these issues, a Zero Length Recurrent based Fruit Fly Optimization (ZLRFF) model is used. ZLRFF is a reduction technique designed to achieve a high compression ratio and to allow direct lookups on compressed data without decompressing it first. Its main goal is to divide a column of Zero-length string type vertically into smaller columns, each of which can be compressed using a separate lightweight compression technique. To search directly on compressed data, we also provide a search technique we call FF-search. Extensive testing demonstrates that ZLRFF supports direct searching on compressed data in addition to achieving a good compression ratio, which enhances query performance. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
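The abstract of entry 3 mentions searching compressed column data without decompressing it first. As a rough illustration of that general idea only, the sketch below uses plain dictionary encoding, which is a swapped-in textbook technique and not the ZLRFF model or FF-search from the paper. The helper names `dictionary_encode` and `search_compressed` and the sample column are hypothetical.

```python
from typing import Dict, List, Tuple


def dictionary_encode(column: List[str]) -> Tuple[Dict[str, int], List[int]]:
    """Replace each string with a small integer code (hypothetical helper)."""
    dictionary: Dict[str, int] = {}
    codes: List[int] = []
    for value in column:
        # Assign the next free code the first time a value is seen.
        code = dictionary.setdefault(value, len(dictionary))
        codes.append(code)
    return dictionary, codes


def search_compressed(dictionary: Dict[str, int], codes: List[int],
                      needle: str) -> List[int]:
    """Return the row positions equal to `needle` by comparing codes only."""
    code = dictionary.get(needle)
    if code is None:
        return []  # the value never occurs, so no row can match
    return [row for row, c in enumerate(codes) if c == code]


# Illustrative column that includes zero-length strings.
column = ["red", "", "blue", "red", "", "green"]
dictionary, codes = dictionary_encode(column)
matches = search_compressed(dictionary, codes, "red")
```

The lookup translates the search value into its code once and then scans integers, so the stored strings are never reconstructed during the search.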
4. An update-intensive LSM-based R-tree index.
- Author
-
Shin, Jaewoo, Zhou, Libin, Wang, Jianguo, and Aref, Walid G.
- Abstract
Many applications require update-intensive workloads on spatial objects, e.g., social-network services and shared-riding services that track moving objects. By buffering insert and delete operations in memory, the Log Structured Merge Tree (LSM) has been used widely in various systems because of its ability to handle write-heavy workloads. While the focus on LSM has been on key-value stores and their optimizations, there is a need to study how to efficiently support LSM-based secondary indexes (e.g., location-based indexes) as modern, heterogeneous data necessitates the use of secondary indexes. In this paper, we investigate the augmentation of a main-memory-based memo structure into an LSM secondary index structure to handle update-intensive workloads efficiently. We conduct this study in the context of an R-tree-based secondary index. In particular, we introduce the LSM RUM-tree that demonstrates the use of an Update Memo in an LSM-based R-tree to enhance the performance of the R-tree’s insert, delete, update, and search operations. The LSM RUM-tree introduces new strategies to control the size of the Update Memo to make sure it always fits in memory for high performance. The Update Memo is a light-weight in-memory structure that is suitable for handling update-intensive workloads without introducing significant overhead. Experimental results using real spatial data demonstrate that the LSM RUM-tree achieves up to 6.6x speedup on update operations and up to 249292x speedup on query processing over existing LSM R-tree implementations. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
5. Tps: A new way to find good vertex-search order for exact subgraph matching
- Author
-
Ma, Yixing, Xu, Baomin, and Yin, Hongfeng
- Subjects
Information and Computing Sciences, Computer Vision and Multimedia Computation, Exact subgraph matching, Subgraph query, Optimization, Query processing, Artificial Intelligence and Image Processing, Computer Software, Distributed Computing, Information Systems, Artificial Intelligence & Image Processing, Software Engineering, Electronics, sensors and digital hardware, Computer vision and multimedia computation, Data management and data science, Distributed computing and systems software
- Published
- 2024
6. Hu-Fu: efficient and secure spatial queries over data federation
- Author
-
Tong, Yongxin, Zeng, Yuxiang, Song, Yang, Pan, Xuchen, Fan, Zeheng, Xue, Chunbo, Zhou, Zimu, Zhang, Xiaofei, Chen, Lei, Xu, Yi, Xu, Ke, and Lv, Weifeng
- Abstract
Data isolation has become an obstacle to scale up query processing over big data, since sharing raw data among data owners is often prohibitive due to security concerns. A promising solution is to perform secure queries over a federation of multiple data owners leveraging secure multi-party computation (SMC) techniques, as evidenced by recent federation studies on relational data. However, existing solutions are highly inefficient on spatial queries due to excessive secure distance operations for query processing and their usage of general-purpose SMC libraries for secure operation implementation. In this paper, we propose Hu-Fu, the first system for efficient and secure spatial query processing on a data federation. Hu-Fu seamlessly supports five mainstream spatial queries at scale, while ensuring both data and query privacy (i.e., sensitive spatial information of data owners and query users). The idea is to decompose the secure processing of a spatial query into as many plaintext operations and as few secure operations as possible, where fewer secure operators are involved and all of them are implemented dedicatedly. As a working system, Hu-Fu supports not only query input in native SQL, but also heterogeneous spatial databases (e.g., PostGIS, GeoMesa, and SpatialHadoop) at the backend. Extensive experiments show that Hu-Fu usually outperforms the state-of-the-arts in running time and communication cost while guaranteeing security. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
7. Design and development of advanced spatio temporal database models
- Author
-
Garima Jolly and Sunita Bhatti
- Subjects
index terms-avoidance, disaster management, path planning, query processing, spatial database, visibility graph, Science
- Abstract
The researchers have presented a spatial database model for geometric path planning suitable for facilitating disaster management activities. The spatial queries executed using the proposed approach can reduce the computational time needed to find an optimal collision-free path for network analysis. The framework is applicable to 2-dimensional and 3-dimensional workspaces. The strategy used decouples the motion planning problem into small tractable problems, which are solved using known path planning algorithms.
- Published
- 2024
- Full Text
- View/download PDF
8. ECEQ: efficient multi-source contact event query processing for moving objects.
- Author
-
Li, Pengyue, Dai, Hua, Zhou, Qian, Chen, Yu, Zhou, Qiang, Li, Bohan, and Yang, Geng
- Abstract
Using trajectory data to query and analyze contact events is an emerging method for disease prevention and control. Existing contact event query processing methods focus only on single-source contact events (one-to-one), overlooking the more realistic multi-source scenarios (n-to-one). In this paper, we focus on the multi-source contact event query problem. We first define multi-source contact events and then introduce a baseline query processing method based on the idea of sliding window scanning. To improve query efficiency, an anchor time point-based optimization strategy and a grid-time-object inverted index-based optimization strategy are designed and used in the optimized query processing method, which can effectively filter out moving objects and time points that have no contact events. Experimental results using two real-world datasets demonstrate that the proposed schemes can effectively identify potential contact events while maintaining lower query time costs compared to existing schemes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. A Novel Approach for Parallel Document Clustering Using an Enhanced Parallel WAND Algorithm.
- Author
-
Ali, Wael, Khamis, Soheir, and Zakaria, Wael
- Subjects
SEARCH algorithms, DATA structures, INFORMATION retrieval, PARALLEL algorithms, SEARCH engines, DOCUMENT clustering
- Abstract
Document clustering is crucial for managing the large volume of textual data available on the Internet, even though clustering large, high-dimensional datasets is computationally costly. To tackle these obstacles, a widely used information retrieval method called the weighted AND (WAND) algorithm is utilized as an essential stage in a document clustering process to make it more effective. WAND uses an efficient data structure known as an inverted index to determine document scores and ranks, allowing it to extract the topK documents that are most similar to a given query. Owing to its effectiveness, several parallel versions of the algorithm have been proposed to enhance it. However, document clustering poses additional challenges, since it requires retrieving a higher number of topK documents and processing longer queries. So, in this paper, an enhanced parallel version of the WAND algorithm (PWAND) is proposed. PWAND divides the inverted index into partitions, each of which is assigned a specific percentage of topK according to its relevance to the given query. Furthermore, a novel PWAND-based Parallel PArtitional Clustering (PWPPAC) approach that combines the parallel execution of clustering stages with PWAND is proposed. Based on practical results across a variety of datasets, PWAND is a promising method, since it produces results that closely match those obtained by WAND but with a significant speedup; the maximum recorded speedup is 85.7x on the AG-News dataset. Moreover, the results show that employing the PWAND algorithm in the clustering process makes it more efficient while maintaining the accuracy and quality of clustering. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
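The abstract of entry 9 describes splitting an inverted index into partitions and combining per-partition top-K results. The sketch below illustrates only that partition-and-merge shape under simplifying assumptions: every partition is scored exhaustively (there is no WAND upper-bound pruning), every partition returns a full k candidates rather than a relevance-based share, and the partitions are visited in a plain loop instead of in parallel. The names `local_top_k` and `partitioned_top_k` and the toy index are hypothetical.

```python
import heapq
from collections import defaultdict
from typing import Dict, List, Tuple

# One index partition: term -> {doc_id: term weight}
Postings = Dict[str, Dict[int, float]]


def local_top_k(partition: Postings, query: List[str],
                k: int) -> List[Tuple[float, int]]:
    """Score every document in one partition and keep its k best."""
    scores: Dict[int, float] = defaultdict(float)
    for term in query:
        for doc_id, weight in partition.get(term, {}).items():
            scores[doc_id] += weight
    return heapq.nlargest(k, ((s, d) for d, s in scores.items()))


def partitioned_top_k(partitions: List[Postings], query: List[str],
                      k: int) -> List[int]:
    """Merge per-partition candidates into a global top-k list of doc ids."""
    candidates: List[Tuple[float, int]] = []
    for partition in partitions:  # each iteration could run on its own worker
        candidates.extend(local_top_k(partition, query, k))
    return [doc_id for _, doc_id in heapq.nlargest(k, candidates)]


# Toy document-partitioned index: each document lives in exactly one partition.
partitions = [
    {"query": {1: 0.5, 2: 1.0}, "index": {1: 1.0}},
    {"query": {3: 2.0}, "index": {4: 0.25}},
]
top = partitioned_top_k(partitions, ["query", "index"], k=2)
```

Because each document belongs to a single partition in this sketch, its local score is already its final score, so merging the local lists yields the same ranking as scoring the whole index at once.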
10. FluiDB : adaptive storage layout using reversible relational operators
- Author
-
Perivolaropoulos, Christos, Viglas, Stratis, and Grot, Boris
- Subjects
relational database, efficiency, materialized views, FluiDB, relational algebra semantics, query processing, garbage collector
- Abstract
It is a popular practice to use materialized intermediate results to improve the performance of RDBMSes. Work in this area has focused either on optimisers matching existing materialized results in the cache or on selecting intermediate results from a plan to survive the plan execution. To our knowledge, few attempts have been made to create plans with cached intermediate results in mind, and none that make any attempt to deduplicate the stored data to alleviate the storage cost of maintaining possibly large queries. We built FluiDB to explore a novel approach to integrating the selection of materialized results with the planner to optimize the logical representation of data in memory. FluiDB materializes common intermediate results and deduplicates data to alleviate the cost of maintaining them. This is achieved by introducing reversible operations: versions of normal relational operators that may produce complementary tables alongside the normal output, that allow the reconstruction of the input relations. A planner aware of such operations can build query plans that dynamically adapt the data layout to the plan under a constrained memory budget. This thesis revolves around four main chapters, each of which describes in detail a different part of FluiDB and a final one that goes into evaluation of the system. The first chapter focuses on query processing and the relational algebra semantics that FluiDB operates under. FluiDB parses queries into DAGs of sub-queries connected by reversible operators. Each such graph of the workload is merged into a global query graph that is used to infer properties of each relation like cardinality and extent. The next chapter is dedicated to the planner and a novel monad for weighted backtracking that the planner is based on. The planner attempts to generate a plan by traversing the query graph so that, besides solving the query at hand, it leaves in memory a curated set of queries aiming to maximize the amortized performance of the workload being run. In this chapter, the garbage collector is also discussed, which is the part of the planner responsible for inserting plan fragments that delete nodes when required such that the available storage budget is respected while no information is lost from the database. After that, we go into Antisthenis, a framework we built for defining incremental computation systems. Antisthenis is used to build modules of the planner that efficiently determine whether a relation is materializable, and estimate the cost of materializing a relation, given a set of materialized relations. Besides computation reuse, Antisthenis is able to prune the computation taking advantage of properties of the operators involved like absorbing group elements and bounded partial results. These techniques are also used to allow evaluation of some classes of self-referential computations. The final chapter about the FluiDB architecture describes the transpilation of plans generated by the planner to C++, as well as the supporting libraries that enable the transpilation of queries to highly specialized C++ code, and the low level data organization of the database. The thesis closes with a chapter that describes our methods for benchmarking and some experimental results.
- Published
- 2023
- Full Text
- View/download PDF
11. ARXML Query Processing Algorithm Based on a Dynamic Fusion Index Tree.
- Author
-
戴深龙, 田镇虎, 李超超, 徐封杰, and 方菱
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
12. General-purpose query processing on summary graphs.
- Author
-
Anagnostopoulos, Aris, Arrigoni, Valentina, Gullo, Francesco, Salvatori, Giorgia, and Severini, Lorenzo
- Abstract
Graph summarization is a well-established problem in large-scale graph data management. Its goal is to produce a summary graph, which is a coarse-grained version of a graph, whose use in substitution for the original graph enables downstream task execution and query processing at scale. Despite the extensive literature on graph summarization, still nowadays query processing on summary graphs is accomplished by either reconstructing the original graph, or in a query-specific manner. No general methods exist that operate on the summary graph only, with no graph reconstruction. In this paper, we fill this gap, and study for the first time general-purpose (approximate) query processing on summary graphs. This is a new important tool to support data-management tasks that rely on scalable graph query processing, including social network analysis. We set the stage of this problem, by devising basic, yet principled algorithms, and thoroughly analyzing their peculiarities and capabilities of performing well in practice, both conceptually and experimentally. The ultimate goal of this work is to make researchers and practitioners aware of this so-far overlooked problem, and define an authoritative starting point to stimulate and drive further research. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Semantic Data Integration and Querying: A Survey and Challenges.
- Author
-
Masmoudi, Maroua, Ben Abdallah Ben Lamine, Sana, Karray, Mohamed Hedi, Archimede, Bernard, and Baazaoui Zghal, Hajer
- Subjects
- *DATABASES, *ELECTRONIC data processing, *GLOBAL value chains, *LINKED data (Semantic Web), *MANAGEMENT information systems, *ONTOLOGIES (Information retrieval), *RDF (Document markup language)
- Published
- 2024
- Full Text
- View/download PDF
14. Indexing spatiotemporal trajectory data streams on key-value storage.
- Author
-
Zhao, Xiaofei, Lam, Kam-Yiu, and Kuo, Tei-Wei
- Subjects
- *GPS receivers, *STORE location, *INFORMATION organization
- Abstract
In a trajectory management system, moving objects are typically equipped with GPS devices to report their locations to a data store. Given the potentially high frequency of location updates by multiple moving objects, these data stores often operate under write-intensive conditions. Existing trajectory indexing methods, such as XZ-ordering, which are geared towards static trajectory data, may experience considerable latency under such demanding workloads. In response, this paper introduces spatio-temporal index structures that can be constructed with low latency. Trajectories are categorized into two types for storage: 'live' and 'static'. Live trajectories are indexed by the Dual-Key Encoding (DKE) scheme, where each data point is represented by two key-value entries, facilitating both ID-temporal and spatial queries. Static trajectories, on the other hand, offer a more compact storage solution, reducing the overhead associated with live trajectories. Upon the completion of a trip, live trajectory data is transformed into a static trajectory entry through a compaction process. To augment the efficiency of spatial index construction for static trajectories, we introduce a new encoding scheme, XS2, coupled with an adaptive segmentation policy, AdaptSeg, to optimize trajectory segmentation, thereby enhancing index building and query processing efficiency. The indexing methods are demonstrated on top of LevelDB, an LSM-based key-value storage library, resulting in the proposed LevelDBST system. Performance evaluations conducted using both synthetic and real-world datasets reveal that LevelDBST is capable of constructing spatial indexes for continuously updating trajectories with reduced latency, in comparison to traditional XZ-ordering methods. This efficiency is achieved while maintaining an acceptable balance in data accessing time and storage costs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
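The abstract of entry 14 says that each live data point is stored as two key-value entries so that both ID-temporal and spatial queries can be served. The sketch below shows one way such a two-key layout could look. It is not the paper's DKE format: the key tuples, the "T"/"S" prefixes, the fixed grid in `cell_of`, the value of `CELL_SIZE`, and the in-memory `TinyOrderedStore` (a stand-in for an ordered store such as LevelDB) are all assumptions made for illustration.

```python
from bisect import bisect_left, insort
from typing import Dict, List, Tuple

Key = Tuple
Point = Tuple[float, float]
CELL_SIZE = 10.0  # assumed grid resolution, purely illustrative


class TinyOrderedStore:
    """In-memory stand-in for an ordered key-value store."""

    def __init__(self) -> None:
        self._keys: List[Key] = []
        self._values: Dict[Key, Point] = {}

    def put(self, key: Key, value: Point) -> None:
        if key not in self._values:
            insort(self._keys, key)  # keep keys sorted for prefix scans
        self._values[key] = value

    def scan_prefix(self, prefix: Key) -> List[Tuple[Key, Point]]:
        """Return all entries whose key starts with `prefix`, in key order."""
        out = []
        for key in self._keys[bisect_left(self._keys, prefix):]:
            if key[:len(prefix)] != prefix:
                break
            out.append((key, self._values[key]))
        return out


def cell_of(x: float, y: float) -> Tuple[int, int]:
    """Map a coordinate to a grid cell (hypothetical spatial encoding)."""
    return int(x // CELL_SIZE), int(y // CELL_SIZE)


def put_point(store: TinyOrderedStore, object_id: int, ts: int,
              x: float, y: float) -> None:
    """Write one location update as two entries."""
    store.put(("T", object_id, ts), (x, y))                 # ID-temporal key
    store.put(("S", cell_of(x, y), ts, object_id), (x, y))  # spatial key


store = TinyOrderedStore()
put_point(store, 7, 100, 3.0, 4.0)
put_point(store, 7, 101, 12.0, 4.0)
put_point(store, 9, 100, 5.0, 8.0)

history_of_7 = [key[2] for key, _ in store.scan_prefix(("T", 7))]
objects_in_cell = [key[3] for key, _ in store.scan_prefix(("S", (0, 0)))]
```

Writing every point twice doubles the write volume for live data, which is consistent with the abstract's remark that static trajectories are stored more compactly once a trip completes.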
15. Object-based image retrieval and detection for surveillance video.
- Author
-
Jagtap, Swati and Chopade, Nilkanth B.
- Subjects
OBJECT recognition (Computer vision), GRAPHICAL user interfaces, VIDEO surveillance, TECHNOLOGICAL innovations, IMAGE retrieval, HUMAN error
- Abstract
With technological advancement worldwide, the video surveillance market is growing drastically in a versatile field. Monitoring, browsing, and retrieving a specific object in a long video becomes difficult due to the enormous amount of data produced by the surveillance camera. With limitations on human resources and browsing time, there is a need for a new video analytics model to handle more complex tasks, such as object detection and query retrieval. The current approach involves techniques like unsupervised segmentation, multiscale segmentation, and feature-based descriptions. However, these methods often encounter extensive space and time consumption challenges. A solution has been developed for retrieving targeted objects from surveillance videos via user queries, employing a graphical interface for input. Relevant frames are extracted based on user-entered text queries by using YOLOv8 for object detection. Users interact through a graphical user interface deployed on a Jetson Xavier Development board. The system's outcome is a time-efficient and highly accurate automated model for object detection and query retrieval, eliminating human errors associated with manually locating objects in videos upon user queries. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Improving query processing in blockchain systems by using a multi-level sharding mechanism.
- Author
-
Matani, Alemeh, Sahafi, Amir, and Broumandnia, Ali
- Subjects
- *DATA structures, *BLOCKCHAINS, *RELATIONAL databases
- Abstract
With the distributed and decentralized nature of blockchain, and with its sequential data access, query processing emerges as a challenging issue in blockchain systems. These features hinder efficient query processing and make it difficult to guarantee the validity and privacy of query results. Several solutions have been proposed to tackle the efficiency, reliability, and privacy challenges of query processing in blockchain systems. However, a comprehensive solution addressing all of these issues has rarely been proposed. In addition, the existing solutions often assume that the blockchain nodes are homogeneous in terms of their capabilities and available resources, while the blockchain nodes can have heterogeneous computational, communication, and storage resources, and can also contribute to the blockchain network in different manners. This work, considering the heterogeneity of network nodes, introduces a multi-level and score-based sharding solution for query processing where the nodes are organized into a hierarchical tree-like structure based on their score and store a proportion of transaction data in a DAG-based data structure, resulting in an efficient query time. Additionally, the nodes reach a consensus over the query results from the bottom to the top of the hierarchical structure, enabling reliable and fast query processing. The experiments conducted during the evaluation show that the efficiency of the proposed work is near that of relational databases in terms of query response time. It also provides a high validity rate by taking advantage of its hierarchical consensus mechanism and preserves the privacy of query results using a delegation-based integration method where the final query result is integrated by the client's representative. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Efficient algorithms for community aware ridesharing.
- Author
-
Nabila, Shuha, Hashem, Tanzima, Anwar, Samiul, and Islam, A. B. M. Alim Al
- Subjects
- *RIDESHARING, *RIDESHARING services, *TRAFFIC congestion, *URBAN pollution, *POLLUTION
- Abstract
Ridesharing services have become a prominent solution to reduce road traffic congestion and environmental pollution in urban areas. Existing ridesharing services fall short in ensuring the social comfort of the riders. We formulate a Community aware Ridesharing Group Set (CaRGS) query that satisfies the spatial and social constraints of the riders and finds a set of ridesharing groups with the maximum number of served riders. The CaRGS query utilizes user social data at the community level to ensure user privacy. We show that the problem of finding the CaRGS query answer is NP-Hard and propose two heuristic approaches, a hierarchical approach and an iterative approach, to evaluate CaRGS queries. We evaluate the effectiveness, efficiency, and accuracy of our solution through extensive experiments using real datasets and present a comparative analysis among the proposed algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Sharing Queries with Nonequivalent User-defined Aggregate Functions.
- Author
-
Zhang, Chao and Farouk, Toumani
- Subjects
- *SQL, *PYTHON programming language, *C++, *AGGREGATION operators, *DATABASES
- Published
- 2024
- Full Text
- View/download PDF
19. DSTree: A Spatio-Temporal Indexing Data Structure for Distributed Networks.
- Author
-
Hojati, Majid, Roberts, Steven, and Robertson, Colin
- Subjects
DATA structures, PEER-to-peer architecture (Computer networks), BLOCKCHAINS, DATA warehousing, PARALLEL processing, DATA management, INFORMATION sharing
- Abstract
The widespread availability of tools to collect and share spatial data enables us to produce a large amount of geographic information on a daily basis. This enormous production of spatial data requires scalable data management systems. Geospatial architectures have changed from clusters to cloud architectures and more parallel and distributed processing platforms to be able to tackle these challenges. Peer-to-peer (P2P) systems as a backbone of distributed systems have been established in several application areas such as web3, blockchains, and crypto-currencies. Unlike centralized systems, data storage in P2P networks is distributed across network nodes, providing scalability and no single point of failure. However, managing and processing queries on these networks has always been challenging. In this work, we propose a spatio-temporal indexing data structure, DSTree. DSTree does not require additional Distributed Hash Trees (DHTs) to perform multi-dimensional range queries. Inserting a piece of new geographic information updates only a portion of the tree structure and does not impact the entire graph of the data. For example, for time-series data, such as storing sensor data, the DSTree performs around 40% faster in spatio-temporal queries for small and medium datasets. Despite the advantages of our proposed framework, challenges such as 20% slower insertion speed or semantic query capabilities remain. We conclude that more significant research effort from GIScience and related fields in developing decentralized applications is needed. The need for the standardization of different geographic information when sharing data on the IPFS network is one of the requirements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Employing Generative Artificial Intelligence in Replacement of Traditional Backend Systems
- Author
-
Moroz, Artur, Solohubov, Illia, Oliinyk, Andrii, Subbotin, Sergey, Skrupsky, Stepan, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Szewczyk, Roman, editor, Zieliński, Cezary, editor, Kaliczyńska, Małgorzata, editor, and Bučinskas, Vytautas, editor
- Published
- 2024
- Full Text
- View/download PDF
21. On-The-Fly Data Distribution to Accelerate Query Processing in Heterogeneous Memory Systems
- Author
-
Berthold, André, Schmidt, Lennart, Obersteiner, Anton, Habich, Dirk, Lehner, Wolfgang, Schirmeier, Horst, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Tekli, Joe, editor, Gamper, Johann, editor, Chbeir, Richard, editor, and Manolopoulos, Yannis, editor
- Published
- 2024
- Full Text
- View/download PDF
22. Efficient Random Sampling from Very Large Databases
- Author
-
Cohen, Idan, Yehezkel, Aviv, Yakhini, Zohar, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Strauss, Christine, editor, Amagasa, Toshiyuki, editor, Manco, Giuseppe, editor, Kotsis, Gabriele, editor, Tjoa, A Min, editor, and Khalil, Ismail, editor
- Published
- 2024
- Full Text
- View/download PDF
23. Enhancing Query Processing in Big Data: Scalability and Performance Optimization
- Author
-
Sahaya Sheela, M., Farhaoui, Yousef, Kanmani Pappa, C., Ashokkumar, N., Aljanabi, Mohammad, Rocha, Álvaro, Series Editor, Hameurlain, Abdelkader, Editorial Board Member, Idri, Ali, Editorial Board Member, Vaseashta, Ashok, Editorial Board Member, Dubey, Ashwani Kumar, Editorial Board Member, Montenegro, Carlos, Editorial Board Member, Laporte, Claude, Editorial Board Member, Moreira, Fernando, Editorial Board Member, Peñalvo, Francisco, Editorial Board Member, Dzemyda, Gintautas, Editorial Board Member, Mejia-Miranda, Jezreel, Editorial Board Member, Hall, Jon, Editorial Board Member, Piattini, Mário, Editorial Board Member, Holanda, Maristela, Editorial Board Member, Tang, Mincong, Editorial Board Member, Ivanovíc, Mirjana, Editorial Board Member, Muñoz, Mirna, Editorial Board Member, Kanth, Rajeev, Editorial Board Member, Anwar, Sajid, Editorial Board Member, Herawan, Tutut, Editorial Board Member, Colla, Valentina, Editorial Board Member, Devedzic, Vladan, Editorial Board Member, and Farhaoui, Yousef, editor
- Published
- 2024
- Full Text
- View/download PDF
24. Smart Learning Applications: Leveraging LLMs for Contextualized and Ethical Educational Technology
- Author
-
Alier, Marc, Casañ, María José, Filvà, Daniel Amo, Huang, Ronghuai, Series Editor, Kinshuk, Series Editor, Jemni, Mohamed, Series Editor, Chen, Nian-Shing, Series Editor, Spector, J. Michael, Series Editor, Gonçalves, José Alexandre de Carvalho, editor, Lima, José Luís Sousa de Magalhães, editor, Coelho, João Paulo, editor, García-Peñalvo, Francisco José, editor, and García-Holgado, Alicia, editor
- Published
- 2024
- Full Text
- View/download PDF
25. CSQF-BA: Efficient Container Query Technology for Cloud Security Query Framework with Bat Algorithm
- Author
-
Hsieh, Chao-Hsien, Xu, Fengya, Kong, Dehong, Yang, Qingqing, Ma, Yue, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Chen, Wei, editor, and Guo, Jiayang, editor
- Published
- 2024
- Full Text
- View/download PDF
26. Finding a Second Wind: Speeding Up Graph Traversal Queries in RDBMSs Using Column-Oriented Processing
- Author
-
Firsov, Mikhail, Polyntsov, Michael, Smirnov, Kirill, Chernishev, George, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Mosbah, Mohamed, editor, Kechadi, Tahar, editor, Bellatreche, Ladjel, editor, and Gargouri, Faiez, editor
- Published
- 2024
- Full Text
- View/download PDF
27. Investigating Learning Join Order Optimization Strategies for Rule-based Data Engines
- Author
-
Karvelas, Antonios, Foufoulas, Yannis, Simitsis, Alkis, and Ioannidis, Yannis
- Published
- 2024
- Full Text
- View/download PDF
28. A Unique Query Processing Framework using Lexical-Cepstral Feature Extraction based B2DT Classifier in Natural Language Processing.
- Author
-
Kolarkar, Ashlesha and Kumar, Sandeep
- Subjects
NATURAL language processing, MACHINE learning, FEATURE extraction, DEEP learning, DATABASES, SYSTEMS design, SQL
- Abstract
The amount of data produced today is constantly increasing. With the advent of contemporary database tools and rising technology, we can store a lot of data. The problem is that many people have yet to adapt to the user interfaces and technological advancements that process data and show it in the manner desired by the user. This implies that a huge number of individuals require additional database management expertise. Thus, this work intends to implement a technology that will help users get precise data from databases without prior knowledge by converting natural language questions into SQL queries. In the existing works, several query processing frameworks have been developed for retrieving data from large databases using advanced machine learning and deep learning techniques. However, those works suffer from computational burden, increased time consumption, high loss, and a lack of reliability and accuracy. Therefore, the proposed work aims to develop simple yet advanced feature extraction and machine learning models for effective query processing. Here, the dictionary database (i.e. Text & Audio based SQL query formation) is created for system design and implementation. The proposed framework can handle both text and audio input data by extracting the features with the use of Lexical Text Data Analyzer (LTDA) and Mel-Frequency Cepstral Coefficients (MFCC) techniques respectively. Then, the extracted features are used to train a Bagged Bayesian Decision Tree (B2DT) classifier for accurate query recognition. Finally, the performance of the proposed LTDA-MFCC integrated with B2DT model is validated and tested using several parameters. As we show, our model improves accuracy to 93.9% on the SQL query questions. In more detail, the accuracy, precision, recall, and F-score of instruction parsing reach 93.9%, 93.5%, 93.2%, and 93.4% respectively. In particular, it provides a more convenient interactive means of accessing databases through natural language statements in English. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Label-constrained shortest path query processing on road networks.
- Author
-
Zhang, Junhua, Yuan, Long, Li, Wentao, Qin, Lu, Zhang, Ying, and Zhang, Wenjie
- Abstract
Computing the shortest path between two vertices is a fundamental problem in road networks. Most of the existing works assume that the edges in the road networks have no labels, but in many real applications, the edges have labels and label constraints may be placed on the edges appearing on a valid shortest path. Hence, we study the label-constrained shortest path queries in this paper. In order to process such queries efficiently, we adopt an index-based approach and propose a novel index structure, LSD-Index, based on tree decomposition. With LSD-Index, we design efficient query processing and index construction algorithms with good performance guarantees. Moreover, due to the dynamic properties of real-world networks, we also devise index maintenance algorithms that can maintain the index efficiently. To evaluate the performance of proposed methods, we conduct extensive experimental studies using large real road networks including the whole USA road network. Compared with the state-of-the-art approach, the experimental results demonstrate that our algorithm not only achieves up to two orders of magnitude speedup in query processing time but also consumes much less index space. Meanwhile, the experimental results also show that the index can also be efficiently constructed and maintained for dynamic graphs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Secure query processing and optimization in cloud environment: a review.
- Author
-
VL, Divya, PA, Job, and Mathew K, Preetha
- Subjects
- *
PROCESS capability , *CLOUD computing , *TRUST - Abstract
Cloud computing is trusted and used by millions of users around the world, so it is essential to offer them trusted and efficient services. Query processing and query optimization are the major components that provide optimal query plans for users through different steps. So far, mathematical evaluation, including algebraic transformation and selectivity estimation models, has been performed to improve security during the optimization and processing of queries. Meanwhile, these approaches use several equivalence rules to obtain various logical forms. That problem was stated for single query processing at a time, but these approaches are not efficient for the dynamic operation of the cloud since they are slower. Hence, this paper presents a review of the improvements made to the processing and optimization of queries with security enhancement over the past 8 years, leaving aside the tedious mathematical procedures. Along with the query processing techniques, the paper provides a quantitative analysis of those techniques, organized by deterministic and non-deterministic query processing. As a result, it is found that when utilizing multiple query processing techniques, the host system may get loaded and stuck, while the adaptive model works better by assigning work to the host system based on its processing capability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Answering reachability queries with ordered label constraints over labeled graphs.
- Author
-
He, Daoliang, Yuan, Pingpeng, and Jin, Hai
- Abstract
Reachability queries play a vital role in many graph analysis tasks. Previous research proposed many methods to efficiently answer reachability queries between vertex pairs. Since many real graphs are labeled graphs, there is high demand for Label-Constrained Reachability (LCR) queries, in which the constraint includes a set of labels in addition to the vertex pair. Recent research proposed several methods for answering LCR queries that require some of the labels specified in the constraint to appear on the path. Besides a label set, the query constraint may be a sequence of ordered labels, giving OLCR (Ordered-Label-Constrained Reachability) queries, which retrieve paths matching a sequence of labels. Currently, no solutions are available for OLCR. Here, we propose DHL, a novel bloom filter based indexing technique for answering OLCR queries. DHL can be used to check reachability between vertex pairs. If the answer is not a definite no, a constrained DFS is then performed. So, we employ DHL followed by constrained DFS to answer OLCR queries. We show that DHL has a bounded false positive rate, and it is powerful in saving indexing time and space. Extensive experiments on 10 real-life graphs and 12 synthetic graphs demonstrate that DHL achieves about 4.8–22.5 times smaller index space and 4.6–114 times less index construction time than two state-of-the-art techniques for LCR queries, while achieving comparable query response time. The results also show that our algorithm can answer OLCR queries effectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. An Efficient and Scalable FHE-Based PDQ Scheme: Utilizing FFT to Design a Low Multiplication Depth Large-Integer Comparison Algorithm.
- Author
-
Zhang, Fahong, Yang, Chen, Zong, Rui, Zheng, Xinran, Wang, Jianfei, and Meng, Yishuo
- Abstract
The growing number of data privacy breaches and associated financial losses have driven the demand for private database queries. Clients typically submit queries that involve both search and computation operations, such as counting students under a certain age or calculating the BMI of employees above a specific age. Existing protocols often face limitations due to reliance on specific-purpose encryption schemes or multiple communication rounds between clients and servers. In this work, we present a unified framework utilizing fully homomorphic encryption techniques to efficiently and privately process queries with search and computation operations. Our contributions include a homomorphic encryption-based private comparison algorithm, called the layered comparison algorithm, which achieves a 2.6-6.6X performance improvement compared to algorithms from prior work; a fast Fourier transform-based preprocessing method enabling accurate large integer arithmetic operations in the encrypted domain; and a scalable database encoding method. Evaluation results demonstrate the practicality of our system, as it processes an aggregated query for a 1k-row encrypted database in approximately 4.53 seconds. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. HINT: a hierarchical interval index for Allen relationships.
- Author
-
Christodoulou, George, Bouros, Panagiotis, and Mamoulis, Nikos
- Abstract
Indexing intervals is a fundamental problem, finding a wide range of applications, most notably in temporal and uncertain databases. We propose HINT, a novel and efficient in-memory index for range selection queries over interval collections. HINT applies a hierarchical partitioning approach, which assigns each interval to at most two partitions per level and has controlled space requirements. We reduce the information stored at each partition to the absolutely necessary by dividing the intervals in it, based on whether they begin inside or before the partition boundaries. In addition, our index includes storage optimization techniques for the effective handling of data sparsity and skewness. We show how HINT can be used to efficiently process queries based on Allen's relationships. Experiments on real and synthetic interval sets of different characteristics show that HINT is typically one order of magnitude faster than existing interval indexing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Multimodal Data Modeling Technology and Its Application for Cloud-edge-device Collaboration.
- Author
-
Shuangshuang Cui, Xian Wu, Hongzhi Wang, and Hao Wu
- Subjects
DATA modeling ,DATA management - Abstract
In the cloud-edge-device collaboration architecture, data types are diverse, and there are differences in storage resources and computing resources at all levels, which brings new challenges to data management. The existing data models or simple superposition of data models are difficult to meet the requirements of multimodal data management and collaborative management in the cloud-edge-device. Therefore, research on multimodal data modeling technology for cloud-edge-device collaboration has become an important issue. The core is how to efficiently obtain the query results that meet the needs of the application from the cloud-edge-device architecture. Starting from the data types of the three-layer data of cloud-edge-device, in this paper we propose a multimodal data modeling technology for cloud-edge-device collaboration, give the definition of multimodal data model based on tuples, and design six base classes to achieve a unified representation of multimodal data. The basic data operation architecture of cloud-edge-device collaborative query is also proposed to meet the query requirements of cloud-edge-device business scenarios. The integrity constraints of the multimodal data model are given, which lays a theoretical foundation for query optimization. Finally, a demonstration application of the multimodal data model for cloud-edge-device collaboration is given, and the proposed data model storage method is verified from the three aspects of data storage time, storage space, and query time. The experimental results show that the proposed scheme can effectively represent the multimodal data in the cloud-edge-device collaboration architecture. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Integration of Knowledge Bases and External Information Sources via Magic Properties and Query-Driven Entity Linking
- Author
-
Ohmori, Yuuki, Kitagawa, Hiroyuki, Amagasa, Toshiyuki, Matono, Akiyoshi, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Delir Haghighi, Pari, editor, Pardede, Eric, editor, Dobbie, Gillian, editor, Yogarajan, Vithya, editor, ER, Ngurah Agus Sanjaya, editor, Kotsis, Gabriele, editor, and Khalil, Ismail, editor
- Published
- 2023
- Full Text
- View/download PDF
36. Clock-G: Temporal Graph Management System
- Author
-
Massri, Maria, Miklos, Zoltan, Raipin, Philippe, Meye, Pierre, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Hameurlain, Abdelkader, editor, Tjoa, A Min, editor, Boucelma, Omar, editor, and Toumani, Farouk, editor
- Published
- 2023
- Full Text
- View/download PDF
37. NoGar: A Non-cooperative Game for Thread Pinning in Array Databases
- Author
-
Dominico, Simone, Alves, Marco A. Z., de Almeida, Eduardo C., Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Strauss, Christine, editor, Amagasa, Toshiyuki, editor, Kotsis, Gabriele, editor, Tjoa, A Min, editor, and Khalil, Ismail, editor
- Published
- 2023
- Full Text
- View/download PDF
38. Utilizing Data from Quick Access Recorder to Predict Faulty Processing on Aircrafts
- Author
-
Jalajakshi, V., Myna, N., Xhafa, Fatos, Series Editor, Rajakumar, G., editor, Du, Ke-Lin, editor, and Rocha, Álvaro, editor
- Published
- 2023
- Full Text
- View/download PDF
39. Where a Little Change Makes a Big Difference: A Preliminary Exploration of Children’s Queries
- Author
-
Pera, Maria Soledad, Murgia, Emiliana, Landoni, Monica, Huibers, Theo, Aliannejadi, Mohammad, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Kamps, Jaap, editor, Goeuriot, Lorraine, editor, Crestani, Fabio, editor, Maistro, Maria, editor, Joho, Hideo, editor, Davis, Brian, editor, Gurrin, Cathal, editor, Kruschwitz, Udo, editor, and Caputo, Annalina, editor
- Published
- 2023
- Full Text
- View/download PDF
40. Fuzzy Semantic Query Mapping and Processing
- Author
-
Chakhar, Salem, Brahmia, Zouhaier, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Farhaoui, Yousef, editor, Rocha, Alvaro, editor, Brahmia, Zouhaier, editor, and Bhushab, Bharat, editor
- Published
- 2023
- Full Text
- View/download PDF
41. Practical planning and execution of groupjoin and nested aggregates.
- Author
-
Fent, Philipp, Birler, Altan, and Neumann, Thomas
- Abstract
Groupjoins combine execution of a join and a subsequent group-by. They are common in analytical queries and occur in about of the queries in TPC-H and TPC-DS. While they were originally invented to improve performance, efficient parallel execution of groupjoins can be limited by contention in many-core systems. Efficient implementations of groupjoins are highly desirable, as groupjoins are not only used to fuse group-by and join, but are also useful to efficiently execute nested aggregates. For these, the query optimizer needs to reason over the result of aggregation to optimally schedule it. Traditional systems quickly reach their limits of selectivity and cardinality estimations over computed columns and often treat group-by as an optimization barrier. In this paper, we present techniques to efficiently estimate, plan, and execute groupjoins and nested aggregates. We propose four novel techniques, aggregate estimates to predict the result distributions of aggregates, parallel groupjoin execution for scalable execution of groupjoins, index groupjoins, and a greedy eager aggregation optimization technique that introduces nested preaggregations to significantly improve execution plans. The resulting system has improved estimates, better execution plans, and a contention-free evaluation of groupjoins, which speeds up TPC-H and TPC-DS queries significantly. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
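The abstract of record 41 describes a groupjoin as a join fused with a subsequent group-by. A minimal Python sketch of that idea follows; it is not the authors' implementation. The relations `customers` and `orders`, the SUM aggregate, and both function names are hypothetical, and the sketch assumes customer ids are unique.

```python
from collections import defaultdict


def join_then_group(customers, orders):
    """Baseline: materialize the join result, then aggregate per customer id."""
    joined = [
        (cid, amount)
        for cid, _name in customers
        for order_cid, amount in orders
        if order_cid == cid
    ]
    totals = defaultdict(float)
    for cid, amount in joined:
        totals[cid] += amount
    return dict(totals)


def groupjoin(customers, orders):
    """Fused variant: one hash table acts as both join table and aggregation table."""
    # cid -> [number of matching orders, running sum]; assumes unique customer ids.
    state = {cid: [0, 0.0] for cid, _name in customers}
    for order_cid, amount in orders:
        entry = state.get(order_cid)
        if entry is not None:  # probe hit: update the aggregate in place
            entry[0] += 1
            entry[1] += amount
    # Inner-join semantics: customers without any matching order are dropped.
    return {cid: total for cid, (count, total) in state.items() if count > 0}
```

Both functions return the same mapping, but the fused version never materializes the intermediate join result. It is single-threaded, so it says nothing about the contention issues the abstract mentions for many-core systems.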
42. SGPAC: generalized scalable spatial GroupBy aggregations over complex polygons.
- Author
-
Abdelhafeez, Laila, Magdy, Amr, and Tsotras, Vassilis J.
- Subjects
- *
POSTAL codes , *CITIES & towns , *POLYGONS , *POINT set theory - Abstract
This paper studies the spatial group-by query over complex polygons. Given a set of spatial points and a set of polygons, the spatial group-by query returns the number of points that lie within the boundaries of each polygon. Groups are selected from a set of non-overlapping complex polygons, typically in the order of thousands, while the input is a large-scale dataset that contains hundreds of millions or even billions of spatial points. This problem is challenging because real polygons (like counties, cities, postal codes, voting regions, etc.) are described by very complex boundaries. We propose a highly-parallelized query processing framework to efficiently compute the spatial group-by query on highly skewed spatial data. We also propose an effective query optimizer that adaptively assigns the appropriate processing scheme based on the query polygons. Our experimental evaluation with real data and queries has shown significant superiority over all existing techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
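The spatial group-by query defined in the abstract of record 42 (count the points that fall inside each polygon) can be illustrated with a naive baseline. The sketch below uses the standard ray-casting point-in-polygon test and a nested loop over points and polygons; it is only an illustration of the query semantics, not the SGPAC framework, and the function names and input layout are assumptions.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count crossings of a horizontal ray going right from (x, y).

    `polygon` is a list of (x, y) vertices in order; points exactly on the
    boundary are not handled specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The edge crosses the ray's horizontal line only if its endpoints
        # lie on opposite sides of y.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_group_by(points, polygons):
    """Return {polygon name: number of points inside it} by brute force."""
    return {
        name: sum(1 for x, y in points if point_in_polygon(x, y, poly))
        for name, poly in polygons.items()
    }
```

This baseline costs one edge scan per point-polygon pair, which is exactly the cost that becomes prohibitive for polygons with very complex boundaries and billions of points, as the abstract notes.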
43. Efficient Document-at-a-time and Score-at-a-time Query Evaluation for Learned Sparse Representations.
- Author
-
MACKENZIE, JOEL, TROTMAN, ANDREW, and LIN, JIMMY
- Abstract
The article focuses on the examination of learned sparse representations generated by transformers in the context of document-at-a-time (DaaT) and score-at-a-time (SaaT) query evaluation for ranking models. The topics covered include the impact of term weights on query performance, a comparison of DaaT and SaaT approaches in this context, and the introduction of optimizations to enhance retrieval efficiency while maintaining effectiveness.
- Published
- 2023
- Full Text
- View/download PDF
44. Persistent graph stream summarization for real-time graph analytics.
- Author
-
Jia, Yan, Gu, Zhaoquan, Jiang, Zhihao, Gao, Cuiyun, and Yang, Jianye
- Subjects
- *
VECTOR spaces , *EMPIRICAL research - Abstract
In massive and rapid graph streams, a useful and important task is to summarize the structure of graph streams in order to enable efficient and effective graph query processing. Although this task has been extensively studied in the literature, we observe that the existing solutions for graph sketches can only answer queries about the current status of the graph stream. In this paper, we aim at designing persistent graph sketches to support graph queries in any given time range in the past. To this end, we first introduce a baseline method by extending an existing graph summarization method. However, our empirical study suggests that the accuracy of the baseline method is unsatisfactory, especially when the query time interval is large. To tackle this issue, we propose a new method PGSS-BDH, which stores the streaming edges using a set of hierarchically organized hashmaps. When a query arrives, we divide the query time interval into a set of disjoint sub-intervals and return the sum of query results on all sub-intervals as the overall query answer. Observing that PGSS-BDH bears a space cost linear in the graph stream size, we further propose an advanced method PGSS-MDC, which uses a set of fixed-size hierarchical counters to store the weight of edges, where the query processing is similar to PGSS-BDH. We theoretically analyze the accuracy of PGSS-BDH and PGSS-MDC. The experimental results on real-life datasets demonstrate that PGSS-MDC can return much more accurate answers than the competitors while consuming comparable query time and much less memory. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
45. Advanced algorithms for optimal meeting points in road networks
- Author
-
Kaijun Liu, Jianming Liu, and Jingwei Zhang
- Subjects
transportation ,geographic information systems ,query processing ,Transportation engineering ,TA1001-1280 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
An optimal meeting point query is used to determine a location in a spatial region to build a new facility that minimizes the sum of the (weighted) road distances from all clients. This problem has been studied in previous work with the assumption that all clients and facilities reside in Euclidean space or along road networks. However, due to the limitations of geographic information system technologies, it is difficult to return an exact geographic location to answer the optimal meeting point query based on a set of arbitrary coordinates. This issue results in various problems, such as positioning and measurement errors, in practical use. This paper aims to identify the optimal meeting point in road networks for clients and facilities residing in non‐Euclidean spaces. Two efficient heuristic solutions are proposed based on approximate and adaptive query processing techniques by using randomized adaptive search and random direction search methods, respectively, to rapidly converge to the global optimum in the geographic coordinate system. Extensive experiments based on real datasets demonstrate that our proposed method achieves a 32.11% improvement over the state‐of‐the‐art approach.
- Published
- 2023
- Full Text
- View/download PDF
46. Early Exit Strategies for Learning-to-Rank Cascades
- Author
-
Francesco Busolin, Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, and Salvatore Trani
- Subjects
Query processing ,efficiency/effectiveness trade-offs ,learning-to-rank ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
The ranking pipelines of modern search platforms commonly exploit complex machine-learned models and have a significant impact on the query response time. In this paper, we discuss several techniques to speed up the document scoring process based on large ensembles of decision trees without hindering ranking quality. Specifically, we study the problem of document early exit within the framework of a cascading ranker made of three components: 1) an efficient but sub-optimal ranking stage; 2) a pruner that exploits signals from the previous component to force the early exit of documents classified as not relevant; and 3) a final high-quality component aimed at finely ranking the documents that survived the previous phase. To maximize speedup and preserve effectiveness, we aim to increase the accuracy of the pruner in identifying non-relevant documents without early exiting documents that are likely to be ranked among the final top-$k$ results. We propose an in-depth study of heuristic and machine-learning techniques for designing the pruner. While the heuristic technique only exploits the score/ranking information supplied by the first sub-optimal ranker, the machine-learned solution named LEAR uses these signals as additional features along with those representing query-document pairs. Moreover, we study alternative solutions to implement the first ranker, either a small prefix of the original forest or an auxiliary machine-learned ranker explicitly trained for this purpose. We evaluated our techniques through reproducible experiments using publicly available datasets and state-of-the-art competitors. The experiments confirm that our early-exit strategies achieve speedups ranging from $3\times$ to $10\times$ without statistically significant differences in effectiveness.
- Published
- 2023
- Full Text
- View/download PDF
47. Query Refinement into Information Retrieval Systems: An Overview
- Author
-
Mawloud Mosbah
- Subjects
Information Retrieval ,Query Processing ,Automatic Query Formulation ,Query Reformulation ,Query Expansion ,Information theory ,Q350-390 - Abstract
A query, expressing the user's need and requirements, plays an important role in an information retrieval system in achieving high search accuracy. In this paper, we present an overview of the different refinement operations that a query may undergo in order to enhance the performance of an information retrieval system, such as automatic query formulation through word prediction, query reformulation, query expansion, and query optimization.
- Published
- 2023
- Full Text
- View/download PDF
48. NALMO: Transforming Queries in Natural Language for Moving Objects Databases.
- Author
-
Wang, Xieyang, Liu, Mengyi, Xu, Jianqiu, and Lu, Hua
- Subjects
- *
NATURAL languages , *MOBILE commerce , *SATISFACTION , *CONFERENCE papers , *DATABASES - Abstract
Moving objects databases (MODs) have been extensively studied due to their wide variety of applications including traffic management, tourist service and mobile commerce. However, queries in natural languages are still not supported in MODs. Since most users are not familiar with structured query languages, it is essentially important to bridge the gap between natural languages and the underlying MODs system commands. Motivated by this, we design a natural language interface for moving objects, named NALMO. In general, we use semantic parsing in combination with a location knowledge base and domain-specific rules to interpret natural language queries. We design a corpus of moving objects queries for model training, which is later used to determine the query type. Extracted entities from parsing are mapped through deterministic rules to perform query composition. NALMO is able to well translate moving objects queries into structured (executable) languages. We support five kinds of queries including time interval queries, range queries, nearest neighbor queries, trajectory similarity queries and join queries. We develop the system in a prototype system SECONDO and evaluate our approach using 280 natural language queries extracted from popular conference and journal papers in the domain of moving objects. Four volunteers give the system satisfaction and related suggestions through three rounds of independent tests. Experimental results show that (i) NALMO achieves accuracy and precision 96.8% and 81.1%, respectively, (ii) the average time cost of translating a query is 1.49s, and (iii) the average satisfaction is 95.5%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
49. Durable queries over non-synchronized temporal data.
- Author
-
Xie, Yanqi, Weng, Wei, and Li, Jianmin
- Subjects
- *
ACQUISITION of data , *ALGORITHMS - Abstract
Temporal data are ubiquitous nowadays, and efficient management of temporal data is of key importance. A temporal data item typically describes the evolution of an object over time. Among the most useful queries over temporal data are durable top-k queries. Given a time window, a durable top-k query finds the objects that are frequently among the best. Existing solutions to durable top-k queries assume that all temporal data are sampled at the same time points (i.e., at any time, there is a corresponding observed value for every temporal data item). However, in many practical applications, temporal data are collected from multiple data sources with different sampling rates. In this light, we investigate the efficient processing of durable top-k queries over temporal data with different sampling rates. We propose an efficient sweep line algorithm to process durable top-k queries over non-synchronized temporal data. We conduct extensive experiments on two real datasets to test the performance of our proposed method. The results show that our method outperforms the baseline solutions by a large margin. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
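The durable top-k query described in the abstract of record 49 can be made concrete with a small sketch of the synchronized case, i.e. the setting the abstract attributes to existing solutions, where every object has a value at every time point. This is not the paper's sweep line algorithm for non-synchronized data. The function name, the input layout, and the durability threshold `tau` (the minimum fraction of time points at which an object must be in the top-k) are assumptions introduced for illustration.

```python
import heapq


def durable_topk(series, start, end, k, tau):
    """Return the ids that rank in the top-k at >= tau of the time points in [start, end].

    `series` maps an object id to a list of values, one per time point;
    all lists are assumed to be sampled at the same time points.
    """
    counts = {obj: 0 for obj in series}
    n_points = end - start + 1
    for t in range(start, end + 1):
        # Objects with the k largest values at time t.
        best = heapq.nlargest(k, series, key=lambda obj: series[obj][t])
        for obj in best:
            counts[obj] += 1
    return sorted(obj for obj, c in counts.items() if c / n_points >= tau)
```

With different sampling rates, the value of an object at an arbitrary time `t` is no longer directly available, which is the gap the abstract says the proposed method addresses.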
50. A generic framework for efficient computation of top-k diverse results.
- Author
-
Islam, Md Mouinul, Asadi, Mahsa, Amer-Yahia, Sihem, and Roy, Senjuti Basu
- Abstract
Result diversification is extensively studied in the context of search, recommendation, and data exploration. There are numerous algorithms that return top-k results that are both diverse and relevant. These algorithms typically have computational loops that compare the pairwise diversity of records to decide which ones to retain. We propose an access primitive DivGetBatch() that replaces repeated pairwise comparisons of diversity scores of records by pairwise comparisons of "aggregate" diversity scores of a group of records, thereby improving the running time of these algorithms while preserving the same results. We integrate the access primitive inside three representative diversity algorithms and prove that the augmented algorithms leveraging the access primitive preserve original results. We analyze the worst and expected case running times of these algorithms. We propose a computational framework to design this access primitive that has a pre-computed index structure I-tree that is agnostic to the specific details of diversity algorithms. We develop principled solutions to construct and maintain I-tree. Our experiments on multiple large real-world datasets corroborate our theoretical findings, while ensuring up to a 24× speedup. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF