268 results for "Database Systems"
Search Results
2. Classification of Cybersecurity Threats, Vulnerabilities and Countermeasures in Database Systems.
- Author
- Almaiah, Mohammed Amin, Saqr, Leen Mohammad, Al-Rawwash, Leen Ahmad, Altellawi, Layan Ahmed, Al-Ali, Romel, and Almomani, Omar
- Subjects
- DATABASES, DENIAL of service attacks, CYBERTERRORISM, AUDIT trails, INFORMATION storage & retrieval systems, RANSOMWARE
- Abstract
Database systems have consistently been prime targets for cyber-attacks and threats due to the critical nature of the data they store. Despite the increasing reliance on database management systems, this field continues to face numerous cyber-attacks. Database management systems serve as the foundation of any information system or application. Any cyber-attack can result in significant damage to the database system and loss of sensitive data. Consequently, cyber risk classifications and assessments play a crucial role in risk management and establish an essential framework for identifying and responding to cyber threats. Risk assessment aids in understanding the impact of cyber threats and developing appropriate security controls to mitigate risks. The primary objective of this study is to conduct a comprehensive analysis of cyber risks in database management systems, including classifying threats, vulnerabilities, impacts, and countermeasures. This classification helps to identify suitable security controls to mitigate cyber risks for each type of threat. Additionally, this research aims to explore technical countermeasures to protect database systems from cyber threats. This study employs the content analysis method to collect, analyze, and classify data in terms of types of threats, vulnerabilities, and countermeasures. The results indicate that SQL injection attacks and Denial of Service (DoS) attacks were the most prevalent technical threats in database systems, each accounting for 9% of incidents. Vulnerable audit trails, intrusion attempts, and ransomware attacks were classified as the second level of technical threats in database systems, comprising 7% and 5% of incidents, respectively. Furthermore, the findings reveal that insider threats were the most common non-technical threats in database systems, accounting for 5% of incidents. Moreover, the results indicate that weak authentication, unpatched databases, weak audit trails, and multiple usage of an account were the most common technical vulnerabilities in database systems, each accounting for 9% of vulnerabilities. Additionally, software bugs, insecure coding practices, weak security controls, insecure networks, password misuse, weak encryption practices, and weak data masking were classified as the second level of security vulnerabilities in database systems, each accounting for 4% of vulnerabilities. The findings from this work can assist organizations in understanding the types of cyber threats and developing robust strategies against cyber-attacks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
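Since the abstract above singles out SQL injection as the most prevalent technical threat, a minimal illustration of the standard countermeasure, parameterized queries, fits here. The table, data, and payload are hypothetical, and Python's built-in sqlite3 module stands in for a production DBMS:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

payload = "nobody' OR '1'='1"  # classic injection payload

# Vulnerable: string concatenation lets the payload rewrite the query.
query = f"SELECT role FROM users WHERE name = '{payload}'"
print(conn.execute(query).fetchall())   # [('admin',)] -- the WHERE clause was hijacked

# Safe: the driver binds the payload as a plain value, never as SQL.
safe = "SELECT role FROM users WHERE name = ?"
print(conn.execute(safe, (payload,)).fetchall())   # [] -- no such user
```

The same binding discipline applies to any driver; only the placeholder syntax differs.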
3. A Forensic Framework for gathering and analyzing Database Systems using Blockchain Technology.
- Author
- Alzahrani, Ahmed Omar, Al-Khasawneh, Mahmoud Ahmad, Alarood, Ala Abdulsalam, and Alsolami, Eesa
- Subjects
- DATABASES, BLOCKCHAINS, DISTRIBUTED databases, ELECTRONIC evidence, DESIGN science, TRANSACTION records
- Abstract
A blockchain is a distributed database that contains the records of transactions shared among all members of a community. Most members must confirm each and every transaction, so a fraudulent transaction cannot slip through. As a rule, once a record is created and accepted by the blockchain, it cannot be altered or deleted by anyone. This study focuses on improving the investigation task in the database forensics field by utilizing blockchain technology. To this end, a novel conceptual framework is proposed for the forensic analysis of data from database systems engaging blockchain technology. This is the first time that blockchain technology has been applied in database forensics for the purpose of tracing digital evidence. The design science research method was adopted to accomplish the objectives of the present study. The findings showed that with the developed forensics framework, data regarding database incidents could be gathered and analyzed in a more efficient manner. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
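The tamper-evidence property that blockchain lends to forensic records can be sketched independently of the paper's framework: a hash chain over audit events, where every block commits to its predecessor, so any later edit is detectable. This is a minimal sketch in pure Python, not the authors' design:

```python
import hashlib, json, time

def block_hash(body):
    # Hash the canonical JSON encoding of the block contents.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def append_record(chain, record):
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = {"ts": time.time(), "record": record, "prev": prev}
    chain.append({**body, "hash": block_hash(body)})

def verify(chain):
    # Editing any earlier block breaks every later prev-hash link.
    for i, b in enumerate(chain):
        body = {"ts": b["ts"], "record": b["record"], "prev": b["prev"]}
        prev = chain[i - 1]["hash"] if i else "0" * 64
        if b["prev"] != prev or b["hash"] != block_hash(body):
            return False
    return True

chain = []
append_record(chain, {"event": "DELETE FROM accounts", "user": "dba1"})
append_record(chain, {"event": "UPDATE salaries", "user": "app"})
print(verify(chain))                       # True
chain[0]["record"]["user"] = "intruder"    # tamper with the evidence
print(verify(chain))                       # False
```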
4. Learning From User-Specified Optimizer Hints in Database Systems.
- Author
- Zakrzewicz, Maciej
- Subjects
- DATABASES, DATABASE management, PERFORMANCE management, MACHINE learning, MASK laws, SQL
- Abstract
Recently, numerous machine learning (ML) techniques have been applied to address database performance management problems, including cardinality estimation, cost modeling, optimal join order prediction, hint generation, etc. In this paper, we focus on query optimizer hints employed by users in their queries in order to mask some Query Optimizer deficiencies. We treat the query optimizer hints, bound to previous queries, as significant additional query metadata and learn to automatically predict which new queries will pose similar performance challenges and should therefore also be supported by query optimizer hints. To validate our approach, we have performed a number of experiments using real-life SQL workloads and we achieved promising results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
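The learning setup described above can be viewed as a text-classification problem over past queries: queries to which users attached hints are positive examples. Everything below (queries, labels, feature extraction, model choice) is a hypothetical scikit-learn sketch, not the paper's method:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

past_queries = [
    "SELECT * FROM orders o JOIN lineitem l ON o.id = l.oid WHERE l.qty > 10",
    "SELECT name FROM customers WHERE id = 42",
    "SELECT * FROM orders o JOIN lineitem l ON o.id = l.oid JOIN parts p ON p.id = l.pid",
    "SELECT count(*) FROM sessions WHERE day = '2024-01-01'",
]
had_hint = [1, 0, 1, 0]   # 1 = the user attached an optimizer hint

# Token-level features are a crude stand-in for real query metadata.
model = make_pipeline(TfidfVectorizer(token_pattern=r"\w+"), LogisticRegression())
model.fit(past_queries, had_hint)

new_query = "SELECT * FROM orders o JOIN lineitem l ON o.id = l.oid"
print(model.predict_proba([new_query])[0][1])  # estimated probability it needs a hint
```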
5. Digital Forensics Readiness Framework (DFRF) to Secure Database Systems.
- Author
- Albugmi, Ahmed
- Subjects
- DATABASES, DIGITAL forensics, DATA recovery, DATABASE security, PREPAREDNESS, COMPUTER passwords, DATA encryption
- Abstract
Database systems play a significant role in structuring, organizing, and managing data of organizations. In this regard, the key challenge is how to protect the confidentiality, integrity, and availability of database systems against attacks launched from within and outside an organization. To resolve this challenge, different database security techniques and mechanisms, which generally involve access control, database monitoring, data encryption, database backups, and strong passwords have been proposed. These techniques and mechanisms have been developed for certain purposes but fall short of many industrial expectations. This study used the design science research method to recommend a new Digital Forensic Readiness Framework, named DFRF, to secure database systems. DFRF involves risk assessments, data classification, database firewalls, data encryption, strong password policies, database monitoring and logging, data backups and recovery, incident response plans, forensic readiness, as well as education and awareness. The proposed framework not only identifies threats and responds to them more effectively than existing models, but also helps organizations stay fully compliant with regulatory requirements and improve their security. The design of the suggested framework was compared with existing models, confirming its superiority. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Teaching computing and experiencing grief in the year 1 AC (after ChatGPT).
- Author
- Geller, James
- Subjects
- CHATGPT, SCIENTIFIC apparatus & instruments, DATABASES, GRIEF, BEREAVEMENT
- Abstract
This article explores the impact of ChatGPT, an advanced language model, on technology and education. It compares the current revolution brought about by ChatGPT to previous technological revolutions and discusses the stages of grief experienced by individuals when faced with the obsolescence of old technologies. The author emphasizes the need for acceptance and adaptation in the face of technological advancements. The text also highlights the impact of ChatGPT on education and suggests strategies for incorporating it into teaching methodologies. It emphasizes the opportunities and possibilities that ChatGPT brings to education, while acknowledging the need for educators to accept and embrace this new technology. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
7. GAN-Based Tabular Data Generator for Constructing Synopsis in Approximate Query Processing: Challenges and Solutions.
- Author
- Fallahian, Mohammadali, Dorodchi, Mohsen, and Kreth, Kyle
- Subjects
- GENERATIVE adversarial networks, DATABASES, DATA distribution
- Abstract
In data-driven systems, data exploration is imperative for making real-time decisions. However, big data are stored in massive databases from which retrieval is difficult. Approximate Query Processing (AQP) is a technique for providing approximate answers to aggregate queries based on a summary of the data (synopsis) that closely replicates the behavior of the actual data; this can be useful when an approximate answer to queries is acceptable in a fraction of the real execution time. This study explores the novel utilization of a Generative Adversarial Network (GAN) for the generation of tabular data that can be employed in AQP for synopsis construction. We thoroughly investigate the unique challenges posed by the synopsis construction process, including maintaining data distribution characteristics, handling bounded continuous and categorical data, and preserving semantic relationships, and we then introduce the advancement of tabular GAN architectures that overcome these challenges. Furthermore, we propose and validate a suite of statistical metrics tailored for assessing the reliability of GAN-generated synopses. Our findings demonstrate that advanced GAN variations exhibit a promising capacity to generate high-fidelity synopses, potentially transforming the efficiency and effectiveness of AQP in data-driven systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
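The AQP half of the idea can be shown without a GAN: answer an aggregate query on a small synopsis whose distribution mimics the full table. In the sketch below the synopsis is simply drawn from the same distribution; in the paper's setting, a trained tabular GAN would generate it:

```python
import numpy as np

rng = np.random.default_rng(0)
full_table = rng.exponential(scale=100.0, size=10_000_000)   # e.g., sales amounts

# Stand-in for a GAN-generated synopsis: a small table that mimics
# the distribution of the real data.
synopsis = rng.exponential(scale=100.0, size=10_000)

exact = full_table.mean()      # the true answer to SELECT AVG(amount)
approx = synopsis.mean()       # the AQP answer from 0.1% as many rows
print(f"exact={exact:.2f}  approx={approx:.2f}  "
      f"relative error={abs(exact - approx) / exact:.2%}")
```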
8. Comparison of SQL, NoSQL and TSDB database systems for smart buildings and smart metering applications.
- Author
- Włostowska, Sandra, Szabela, Julia, Chojecki, Adrian, and Borkowski, Piotr
- Subjects
- DATABASES, NONRELATIONAL databases, SMART meters, RELATIONAL databases, INTELLIGENT buildings, PYTHON programming language, SQL
- Published
- 2023
- Full Text
- View/download PDF
9. Towards Conceptualization of a Prototype for Quantum Database: A Complete Ecosystem.
- Author
- Chakraborty, Sayantan
- Subjects
- DATABASES, QUANTUM computers, COMPUTER systems, QUANTUM computing, DATABASE management, INFORMATION storage & retrieval systems
- Abstract
This study proposes the conceptualization of a prototype, and the possibility of converging the classical database and a fully quantum database. The study identifies the gap between classical and quantum databases and proposes a prototype that can be implemented in future products and used in future industrial product development on hybrid quantum computers. Whereas existing concepts treat the oracle as a black box, this study opens up the possibility for the quantum industry to develop the QASAM module, so that a fully quantum database can be created instead of using a classical database as a black box. As the Toffoli gate is effectively a NAND gate, it is theoretically possible to run any algorithm on quantum computers. We therefore propose a logical design for memory management for the quantum database, a security enhancement model, a Quantum Recovery Manager and automatic storage management model, and more for the quantum database, which will ensure the quantum advantages. In this study, we also explain the Quantum Vector Database as well as the possibility of improvement in duality quantum computing. This opens up new scope, possibilities, and research areas in a new approach to quantum databases and duality quantum computing. CCS Concepts: • Computer systems organization → Quantum computing; • Software and its engineering → Software infrastructure; • General and reference → Reference works; General literature; Performance; • Information systems → DBMS engine architectures; Main memory engines; Key-value stores; Database utilities and tools; • Hardware → Quantum technologies. [ABSTRACT FROM AUTHOR]
- Published
- 2023
10. Generalized linear models for massive data via doubly-sketching.
- Author
- Hou-Liu, Jason and Browne, Ryan P.
- Abstract
Generalized linear models are a popular analytics tool with interpretable results and broad applicability, but require iterative estimation procedures that impose data transfer and computational costs that can be problematic under some infrastructure constraints. We propose a doubly-sketched approximation of the iteratively re-weighted least squares algorithm to estimate generalized linear model parameters using a sequence of surrogate datasets. The procedure sketches once to reduce data transfer costs, and sketches again to reduce data computation costs, yielding wall-clock time savings. Regression coefficients and standard errors are produced, with comparison against literature methods. Asymptotic properties of the proposed procedure are shown, with empirical results from simulated and real-world datasets. The efficacy of the proposed method is investigated across a variety of commodity computational infrastructure configurations accessible to practitioners. A highlight of the present work is the estimation of a Poisson-log generalized linear model across almost 1.7 billion observations on a personal computer in 25 min. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
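The core of the approach is solving the least-squares subproblem inside each IRLS iteration on a small surrogate dataset. In this sketch, uniform row subsampling stands in for the paper's two sketching operators, and all numbers are invented:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 1_000_000, 5, 5_000        # rows, features, sketch size

X = rng.normal(size=(n, d))
beta_true = np.arange(1, d + 1, dtype=float)
y = X @ beta_true + rng.normal(size=n)

# Exact least squares (the subproblem at the heart of each IRLS step).
beta_exact = np.linalg.lstsq(X, y, rcond=None)[0]

# Sketch-and-solve: fit on an m-row surrogate instead of all n rows.
idx = rng.choice(n, size=m, replace=False)
beta_sketch = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]

print(np.round(beta_exact, 3))    # ~ [1, 2, 3, 4, 5]
print(np.round(beta_sketch, 3))   # close, using 0.5% of the rows
```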
11. Explainable AI for DBA: Bridging the DBA's experience and machine learning in tuning database systems.
- Author
- Ouared, Abdelkader, Amrani, Moussa, and Schobbens, Pierre-Yves
- Subjects
- DATABASES, MACHINE learning, ARTIFICIAL intelligence, RANDOM forest algorithms, DECISION trees
- Abstract
Summary: Recently, artificial intelligence techniques have become a driver for many applications in the database community. Solutions adopting AI in the database core show that incorporating AI improves query processing and the self-tuning of database systems. In traditional systems, self-tuning databases are commonly addressed with heuristics that suggest the physical structures (e.g., creation of indexes and materialized views) enabling the fastest execution of queries. However, existing design tools do not explain or justify how the system behaves and the reasoning behind tuning activities. Moreover, these tools do not keep the database administrator (DBA) in the loop of the optimization process, so that some of the automatic tuning decisions can be trusted. To address this problem, we introduce a framework called Explain-Tun that predicts and explains self-tuning actions with a transparent strategy from historical data using two explicit models, namely decision trees and random forests. First, we propose an AI-based DBMS approach that explains how to select physical structures and provides decision rules extracted by machine learning (ML) as a pluggable component. Second, we propose a goal-oriented model that keeps the DBA in the loop of the optimization process and allows ML models to be manipulated as CRUD entities. Finally, we evaluate our approach on three use cases; the results show that bridging the DBA's experience and ML makes sense in tuning database systems. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
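A minimal sketch of the "explicit model" idea: fit an interpretable model on (hypothetical) past tuning outcomes and show the extracted rules to the DBA. Features, data, and thresholds are invented for illustration:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# One row per (table, column): [row_count, predicate_selectivity, scans_per_hour]
X = [
    [1_000_000, 0.001, 500],   # big table, selective predicate, hot column
    [1_000_000, 0.900, 500],   # big table but unselective predicate
    [500,       0.001, 500],   # tiny table: an index rarely pays off
    [2_000_000, 0.010, 50],
    [800,       0.500, 10],
]
y = [1, 0, 0, 1, 0]            # 1 = creating an index helped in the past

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
# The printed rules are the "explanation" surfaced to the DBA.
print(export_text(tree, feature_names=["rows", "selectivity", "scans_per_hour"]))
```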
12. Mochi: A Case Study in Translational Computer Science for High-Performance Computing Data Management.
- Author
- Carns, Philip, Dorier, Matthieu, Latham, Rob, Ross, Robert B., Snyder, Shane, Soumagne, Jerome, Parashar, Manish, and Abramson, David
- Subjects
- DATA management, HIGH performance computing, DATABASES, COMPUTER science, PROBLEM solving
- Abstract
High-performance computing (HPC) has become an indispensable tool for solving diverse problems in science and engineering. Harnessing the power of HPC is not just a matter of efficient computation, however; it also calls for the efficient management of vast quantities of scientific data. This presents daunting challenges: rapidly evolving storage technology has motivated a shift toward increasingly complex, heterogeneous storage architectures that are difficult to optimize, and scientific data management needs have become every bit as diverse as the application domains that drive them. There is a clear need for agile, adaptable storage solutions that can be customized for the task and platform at hand. This motivated the establishment of the Mochi composable data service project. The Mochi project provides a library of robust, reusable, modular, and connectable data management components and microservices along with a methodology for composing them into specialized distributed data services. Mochi enables rapid deployment of custom data services with a high degree of developer productivity while still effectively leveraging cutting-edge HPC hardware. This article explores how the principles of translational computer science have been applied in practice in Mochi to achieve these goals. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
13. Energy-Efficient Database Systems: A Systematic Survey.
- Author
- Binglei Guo, Jiong Yu, Dexian Yang, Hongyong Leng, and Bin Liao
- Subjects
- DATABASES, DATABASE management, CONSUMPTION (Economics), ENERGY consumption, BIG data
- Abstract
Constructing energy-efficient database systems to reduce economic costs and environmental impact has been studied for 10 years. With the emergence of the big data age, along with the data-centric and data-intensive computing trend, the great amount of energy consumed by database systems has become a major concern in a society that pursues Green IT. However, to the best of our knowledge, despite the importance of this matter in Green IT, there have been few comprehensive or systematic studies conducted in this field. Therefore, the objective of this article is to present a literature survey with breadth and depth on existing energy management techniques for database systems. The existing literature is organized hierarchically with two major branches focusing separately on energy consumption models and energy-saving techniques. Under each branch, we first introduce some basic knowledge; then we classify, discuss, and compare existing research according to their core ideas, basic approaches, and main characteristics. Finally, based on the observations from our study, we identify multiple open issues and challenges, and provide insights for future research. It is our hope that the outcome of this work will help researchers develop more energy-efficient database systems. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
14. FragTracer: Real-Time Fragmentation Monitoring Tool for F2FS File System.
- Author
- Cho, Minseon and Kang, Donghyun
- Subjects
- COMPUTER performance, COMPUTER systems, DATABASES
- Abstract
Emerging hardware devices (e.g., NVMe SSD, RISC-V, etc.) open new opportunities for improving the overall performance of computer systems. In addition, applications try to fully utilize hardware resources to keep up with those improvements. However, these trends can cause significant file system overheads (i.e., fragmentation issues). In this paper, we first study the reasons for fragmentation issues on the F2FS file system and present a new tool, called FragTracer, which helps to analyze the ratio of fragmentation in real time. For user-friendly usage, we designed FragTracer with three main modules, monitoring, pre-processing, and visualization, which run automatically without any user intervention. We also optimized FragTracer in terms of performance to hide its overhead in tracking and analyzing fragmentation issues on-the-fly. We evaluated FragTracer with three real-world databases on the F2FS file system, so as to study the fragmentation characteristics caused by databases, and we measured the overhead of FragTracer. Our evaluation results clearly show that the overhead of FragTracer is negligible when running on commodity computing environments. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
15. Postnatal corticosteroid exposure in very preterm infants: A French cohort study.
- Author
- Iacobelli, Silvia, Allamèle-Moutama, Käliani, Lorrain, Simon, Gouyon, Béatrice, Gouyon, Jean-Bernard, and Bonsante, Francesco
- Subjects
- PREMATURE infants, FRENCH people, DYSPLASIA, COHORT analysis, WEIGHT in infancy, BIRTH weight, BRONCHOPULMONARY dysplasia, CORTICOSTEROIDS
- Abstract
Background: Postnatal corticosteroids (PC) are widely used in very preterm infants. International reports and national multicenter trials describe marked variability in the use of PC across countries and sites. Little information is available on the therapeutic indications and prescription characteristics of PC. Aim: The main objective of this study was to describe the exposure to PC in a large cohort of preterm infants born at less than 32 weeks of gestation, according to the prescription data of 41 tertiary-care NICUs in France. Secondary objectives were to describe therapeutic indications, day of life (DOL) of the first exposure, route of administration, duration, cumulative dose for each drug, and differences in exposure rates across centers. Methods: We conducted a prospective observational cohort analysis from January 2017 to December 2021, in 41 French tertiary-care NICUs using the same computerized order-entry system. Results: In total, 13,913 infants [birth weight 1144.8 (±365.6) g] were included. Among them, 3633 (26.1%) were exposed to PC, 21.8% by the systemic and 10.1% by the inhaled route. Within the study population, 1,992 infants (14.3%) received the first corticosteroid treatment in the first week of life and 1641 (11.8%) after DOL 7. The most frequent indications were prevention and/or treatment of bronchopulmonary dysplasia, and arterial hypotension. Hydrocortisone was the most often prescribed molecule. For systemic PC, the first exposure occurred on average at DOL 9.4 (±13.5), the mean duration of treatment was 10.3 (±14.3) days, and the cumulative dose (expressed as the equivalent dose of hydrocortisone) was in median [IQR] 9.0 [5.5-28.8] mg/kg. For inhaled PC, the first exposure occurred on average at DOL 34.1 (±19.7), with a mean duration of treatment of 28.5 (±24.4) days. The exposure rate ranged from a minimum of 5% to a maximum of 56% among centers, and significantly increased over the study period (p < 0.0001). Conclusion: In this French cohort of very preterm infants, around one patient in five was exposed to PC during the hospital stay in the NICU. The exposure occurred early, starting from the first week of life. The exposure rate varied widely among centers. Pharmacoepidemiology studies are useful to increase knowledge of corticosteroid utilization patterns in preterm infants. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
16. Spatio-Temporal Semantic Data Model for Precision Agriculture IoT Networks.
- Author
- San Emeterio de la Parte, Mario, Lana Serrano, Sara, Muriel Elduayen, Marta, and Martínez-Ortega, José-Fernán
- Subjects
- DATA modeling, INTERNET of things, DATA management, CROP management, DATABASES, PRECISION farming
- Abstract
In crop and livestock management within the framework of precision agriculture, scenarios full of sensors and devices are deployed, involving the generation of a large volume of data. Some solutions require rapid data exchange for action or anomaly detection. However, the administration of this large amount of data, which in turn evolves over time, is highly complicated. Management systems add long-time delays to the spatio-temporal data injection and gathering. This paper proposes a novel spatio-temporal semantic data model for agriculture. To validate the model, data from real livestock and crop scenarios, retrieved from the AFarCloud smart farming platform, are modeled according to the proposal. Time-series Database (TSDB) engine InfluxDB is used to evaluate the model against data management. In addition, an architecture for the management of spatio-temporal semantic agricultural data in real-time is proposed. This architecture results in the DAM&DQ system responsible for data management as semantic middleware on the AFarCloud platform. The approach of this proposal is in line with the EU data-driven strategy. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
17. A Comprehensive Review of Security Measures in Database Systems: Assessing Authentication, Access Control, and Beyond.
- Author
- Omotunde, Habeeb and Ahmed, Maryam
- Subjects
- DATABASE security, DATABASES, ACCESS control, DATA security, BIG data, INTERNET of things
- Abstract
This paper presents a comprehensive review of security measures in database systems, focusing on authentication, access control, encryption, auditing, intrusion detection, and privacy-enhancing techniques. It aims to provide valuable insights into the latest advancements and best practices in securing databases. The review examines the challenges, vulnerabilities, and mitigation strategies associated with database security. It explores various authentication methods, access control models, encryption techniques, auditing and monitoring approaches, intrusion detection systems, and data leakage prevention mechanisms. The paper also discusses the impact of emerging trends such as cloud computing, big data, and the Internet of Things on database security. By synthesizing existing research, this review aims to contribute to the advancement of database security and assist organizations in protecting their sensitive data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
18. Cross-sectional survey of barriers and opportunities for engaging backyard poultry producers and veterinarians in addressing poultry health.
- Author
- Greening, SS and Gates, MC
- Subjects
- HEALTH information systems, POULTRY, VETERINARIANS, TRAINING of veterinarians, CLINICS, FRONT yards & backyards
- Abstract
To assess the current level of engagement between backyard poultry keepers and veterinarians in New Zealand; to understand the opportunities and barriers for improving access to poultry health care; and to gauge the interest of backyard poultry keepers in participating in a voluntary national poultry health information system. Backyard poultry were defined as any bird species kept for non-commercial purposes. Separate cross-sectional surveys were administered to backyard poultry keepers and veterinarians in New Zealand over 12-week periods starting 22 March 2021 and 03 May 2021 respectively. The veterinarian survey was advertised in the monthly update e-mail from the Veterinary Council of New Zealand, while the survey for backyard poultry keepers was advertised on various online platforms that focus on raising backyard poultry. Results for quantitative variables were reported as basic descriptive statistics, while qualitative free-text responses from open-ended questions were explored using thematic analysis. A total of 125 backyard poultry keepers and 35 veterinarians completed the survey. Almost half (56/125; 44.8%) of backyard poultry keepers reported that they had never taken their birds to a veterinarian, with common reasons being difficulty finding a veterinarian, cost of treatment, and perceptions that most visits result in the bird being euthanised. The majority (113/125; 90.4%) of backyard poultry keepers reported that a general internet search was their primary source for poultry health advice. However, it remains unclear if owners were satisfied with the advice found online, as many cited that having access to reliable health information would be an incentive for registering with a poultry health information system. Of the veterinarian responses, 29/35 (82.9%) reported treating an increasing number of poultry in the last 5 years, although many (27/35; 77.1%) suggested they would be hesitant to increase their poultry caseload due to concerns over their lack of knowledge and confidence in poultry medicine; a lack of clinic resources to treat poultry; concerns over the cost-effectiveness of treatments; and a general feeling of helplessness when treating poultry, with most consultations being for end-stage disease and euthanasia. The results of this study highlight opportunities for increased engagement between backyard poultry keepers and veterinarians, including making available accurate poultry health information and providing veterinarians with improved training in poultry medicine. The results also support the development of a poultry health information system in New Zealand to further enhance health and welfare in backyard poultry populations. Abbreviations: MPI: Ministry for Primary Industries [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
19. Big Data Pipelines on the Computing Continuum: Tapping the Dark Data.
- Author
- Roman, Dumitru, Prodan, Radu, Nikolov, Nikolay, Soylu, Ahmet, Matskin, Mihhail, Marrella, Andrea, Kimovski, Dragi, Elvesaeter, Brian, Simonet-Boulogne, Anthony, Ledakis, Giannis, Song, Hui, Leotta, Francesco, and Kharlamov, Evgeny
- Subjects
- BIG data, DATABASES, DATA integrity
- Abstract
The computing continuum opens new opportunities for managing big data pipelines, in particular the efficient management of heterogeneous and untrustworthy resources. We discuss the lifecycle of big data pipelines on the computing continuum and its associated challenges, and we outline a future research agenda in this area. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
20. To share or not to share vector registers?
- Author
- Pietrzyk, Johannes, Krause, Alexander, Habich, Dirk, and Lehner, Wolfgang
- Abstract
Query execution techniques in database systems constantly adapt to novel hardware features to achieve high query performance, in particular for analytical queries. In recent years, vectorization based on the Single Instruction Multiple Data parallel paradigm has been established as a state-of-the-art approach to increase single-query performance. However, since concurrent analytical queries running in parallel often access the same columns and perform the same set of vectorized operations, data accesses and computations among different queries may be executed redundantly. Various techniques have already been proposed to avoid such redundancy, ranging from concurrent scans via the construction of materialized views to applying multiple query optimization techniques. Continuing this line of research, in this paper we investigate the opportunity of sharing vector registers for concurrently running queries in analytical scenarios. In particular, our novel sharing approach relies on processing data elements of different queries together within a single vector register. As we show, sharing vector registers to optimize the execution of concurrent analytical queries can be very beneficial in single-threaded as well as multi-threaded environments. Therefore, we demonstrate the feasibility and applicability of such a novel work-sharing strategy and thus open up a wide spectrum of future research opportunities. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
21. Heuristic Pairwise Alignment in Database Environments.
- Author
- Lipták, Panna, Kiss, Attila, and Szalai-Gindl, János Márk
- Subjects
- DATABASES, DNA data banks, COMPARATIVE method, ELECTRONIC data processing, HEURISTIC, BIOINFORMATICS software
- Abstract
Biological data have gained wider recognition during the last few years, although managing and processing these data in an efficient way remains a challenge in many areas. Increasingly, more DNA sequence databases can be accessed; however, most algorithms on these sequences are performed outside of the database with different bioinformatics software. In this article, we propose a novel approach for the comparative analysis of sequences, thereby defining heuristic pairwise alignment inside the database environment. This method takes advantage of the benefits provided by the database management system and presents a way to exploit similarities in data sets to quicken the alignment algorithm. We work with the column-oriented MonetDB, and we further discuss the key benefits of this database system in relation to our proposed heuristic approach. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
22. Block the Chain: Software Weapons of Fighting Against COVID-19.
- Subjects
- COVID-19, COMPUTER software, COMPUTER architecture, INFORMATION architecture, DATA management, BLOCKCHAINS
- Abstract
This article proposes an architecture for vaccination information validation and tracking with a fog and cloud-based blockchain system, providing a privacy-aware and scalable approach for interoperable and effective data management. It evaluates the scalability of the underlying blockchain system by means of simulation. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
23. LogStore: A Workload-Aware, Adaptable Key-Value Store on Hybrid Storage Systems.
- Author
- Menon, Prashanth, Qadah, Thamir M., Rabl, Tilmann, Sadoghi, Mohammad, and Jacobsen, Hans-Arno
- Subjects
- SOLID state drives, COMPACTING, DATABASES
- Abstract
Due to the recent explosion of data volume and velocity, a new array of lightweight key-value stores has emerged to serve as alternatives to traditional databases. The majority of these storage engines, however, sacrifice their read performance in order to cope with write throughput by avoiding random disk access when writing a record in favor of fast sequential accesses. However, the boundary between sequential and random access is becoming blurred with the advent of solid-state drives (SSDs). In this work, we propose our new key-value store, LogStore, optimized for hybrid storage architectures. Additionally, we introduce a novel cost-based data staging model based on log-structured storage, in which recent changes are first stored on SSDs and pushed to HDDs as they age, while minimizing the read/write amplification of merging data from SSDs and HDDs. Furthermore, we take a holistic approach to improving both read and write performance by dynamically optimizing the data layout, such as deferring and reversing the compaction process, and developing an access strategy that leverages the strengths of each available medium in our storage hierarchy. Lastly, in our extensive evaluation, we demonstrate that LogStore achieves up to a 6x improvement in throughput/latency over LevelDB, a state-of-the-art key-value store. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
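The age-based staging idea reduces to a few lines in caricature. This toy sketch uses a hypothetical age threshold and in-memory dicts for the two tiers; LogStore's real model is cost-based and operates on log-structured files:

```python
import time

class TieredStore:
    """Writes land on the fast tier (SSD); entries migrate down as they age."""

    def __init__(self, max_age_s=60.0):
        self.fast = {}               # key -> (value, write_time)
        self.slow = {}               # aged data (HDD stand-in)
        self.max_age_s = max_age_s

    def put(self, key, value):
        self.fast[key] = (value, time.monotonic())

    def get(self, key):
        if key in self.fast:
            return self.fast[key][0]
        return self.slow.get(key)

    def migrate(self):
        now = time.monotonic()
        aged = [k for k, (_, t) in self.fast.items() if now - t > self.max_age_s]
        for key in aged:
            self.slow[key] = self.fast.pop(key)[0]

store = TieredStore(max_age_s=0.0)   # age out immediately, for the demo
store.put("k1", "v1")
store.migrate()
print(store.get("k1"))               # "v1", now served from the slow tier
```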
24. Vulnerability of Saudi Private Sector Organisations to Cyber Threats and Methods to Reduce the Vulnerability.
- Author
- Shafie, Emad
- Subjects
- CYBERTERRORISM, PRIVATE sector, INTERNET security, COMPUTER crime prevention, SECURITY management, PERCENTILES, COMPUTER security
- Abstract
The Middle Eastern region has witnessed many cyber-attacks in recent years, especially in Saudi Arabia. Saudi Arabian organisations face problems anticipating, detecting, mitigating, or preventing cyber-attacks despite policies and regulations. The reasons for this have not been investigated adequately. This research aims to study the methods used to address cyber security issues in the private sector. A survey of IT managers of private organisations yielded 230 usable responses. The data were analysed for descriptive statistics and frequency estimations of responses, and the results are presented in this paper. Poor awareness of cyber security issues is reflected in the survey responses. The expenditure on cyber security, especially by large firms, was inadequate. There was a greater tendency to outsource many aspects of cyber security without concern about the risks. Very few IT managers considered a cyber threat within the next year to be a certainty, which matters for proactive strategies to prevent attacks. The findings highlight that IT managers lack some of the knowledge and skills required to perform their roles well. Additionally, many weaknesses have been detected in cyber security management in Saudi private organisations, and there is room to improve the quality of computer security systems. The published literature largely supported these findings. The findings from this study have implications for the stakeholders, especially IT managers working in the private sector of Saudi Arabia. The lessons from this study may be used to address the vulnerabilities identified, and they clearly show the need to train IT managers of Saudi private organisations. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
25. Performance Evaluation of Distributed Database Strategies Using Docker as a Service for Industrial IoT Data: Application to Industry 4.0.
- Author
- Gkamas, Theodosios, Karaiskos, Vasileios, and Kontogiannis, Sotirios
- Subjects
- DISTRIBUTED databases, INDUSTRY 4.0, INTERNET of things, STEVEDORES, CLOUD storage
- Abstract
Databases are an integral part of almost every application nowadays. For example, applications using Internet of Things (IoT) sensory data, such as in Industry 4.0, are a classic example of an organized storage system. Due to its enormous size, such data may be stored in the cloud. This paper presents the authors' proposition for cloud-centric sensory measurement storage and acquisition. It then focuses on evaluating industrial cloud storage engines for sensory functions, experimenting with three open-source distributed Database Management System (DBMS) setups: MongoDB and PostgreSQL, the latter in two schemes (JavaScript Object Notation (JSON)-based and relational), each against its respective horizontal scaling strategies. Several experimental cases have been performed to measure database queries' response time, achieved throughput, and corresponding failures. Three distinct scenarios, the most common and widely used, have been thoroughly tested: (i) data insertions, (ii) select/find queries, and (iii) queries related to aggregate correlation functions. The experimental results concluded that PostgreSQL with JSON achieves a 5–57% better response than MongoDB for the insert queries (cases of native, two, and four shards implementations), while, on the contrary, MongoDB achieved 56–91% higher throughput than PostgreSQL for the same setup. Furthermore, for the data insertion experimental cases of six and eight shards, MongoDB performed 13–20% better than PostgreSQL in response time, achieving 2x higher throughput. Relational PostgreSQL was 2x faster than MongoDB in its standalone implementation for selection queries. At the same time, MongoDB achieved 19–31% faster responses and 44–63% higher throughput than PostgreSQL in the four tested sharding subcases (two, four, six, eight shards), accordingly. Finally, relational PostgreSQL significantly outperformed MongoDB and PostgreSQL JSON in all correlation function experiments, with MongoDB closing the response-time gap with PostgreSQL to 26% and 3% for six and eight shards, respectively, and achieving significant gains in average achieved throughput. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
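A micro-benchmark in the spirit of the paper's insert tests could look like the following sketch. The connection strings are placeholders for local test instances, and the PostgreSQL table is assumed to exist already:

```python
import time
import psycopg2
from pymongo import MongoClient

docs = [{"sensor": i % 10, "value": i * 0.1} for i in range(10_000)]

# MongoDB: bulk insert into a collection.
coll = MongoClient("mongodb://localhost:27017")["iot"]["measurements"]
t0 = time.perf_counter()
coll.insert_many(docs)
print(f"MongoDB insert_many:    {time.perf_counter() - t0:.3f}s")

# PostgreSQL (relational scheme): batched INSERTs into a plain table.
pg = psycopg2.connect("dbname=iot user=test")
cur = pg.cursor()
t0 = time.perf_counter()
cur.executemany(
    "INSERT INTO measurements (sensor, value) VALUES (%s, %s)",
    [(d["sensor"], d["value"]) for d in docs],
)
pg.commit()
print(f"PostgreSQL executemany: {time.perf_counter() - t0:.3f}s")
```

A real comparison would also vary sharding, batch sizes, and concurrency, as the paper does.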
26. From Data Science to Data Resistance: Possible Methods to Stop a Global Tyrant to Come.
- Author
- Christianto, Victor
- Subjects
- DATA science, DICTATORS, DATABASES, BIG data
- Abstract
The present article is actually the result of various discussions with a number of colleagues on what will come after this pandemic. Recent studies by Prof. Fioranelli and also by Rubik clearly indicate that a bigger plan is being played out behind the scenes. And Paul Levy's recent book warns us that the real plague is a kind of spiritual disease called "wetiko." As a wise word tells us: "It is not how well you play the game (i.e., how to reduce the spread of the pandemic), but how the game is played against you." Therefore, let us ask how we can respond properly, in a dignified way, while keeping our moral integrity, i.e., without stepping down into violent means (i.e., the ahimsa way). It turns out that one available course of action is to turn the notion of Data Science into Data Resistance. We outline several available methods, but let us add a cautionary remark: we should not use big data methods unless they are the last resort, because once we partake in that big-data madness, we too are using the same evil tools against humanity. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
27. Exploiting Database Management Systems and Treewidth for Counting.
- Author
- Fichte, Johannes K., Hecher, Markus, Thier, Patrick, and Woltran, Stefan
- Subjects
- DATABASES, COUNTING, DYNAMIC programming, RELATION algebras, ALGORITHMS, SQL
- Abstract
Bounded treewidth is one of the most cited combinatorial invariants in the literature. It was also applied for solving several counting problems efficiently. A canonical counting problem is #Sat, which asks to count the satisfying assignments of a Boolean formula. Recent work shows that benchmarking instances for #Sat often have reasonably small treewidth. This paper deals with counting problems for instances of small treewidth. We introduce a general framework to solve counting questions based on state-of-the-art database management systems (DBMSs). Our framework takes explicitly advantage of small treewidth by solving instances using dynamic programming (DP) on tree decompositions (TD). Therefore, we implement the concept of DP into a DBMS (PostgreSQL), since DP algorithms are already often given in terms of table manipulations in theory. This allows for elegant specifications of DP algorithms and the use of SQL to manipulate records and tables, which gives us a natural approach to bring DP algorithms into practice. To the best of our knowledge, we present the first approach to employ a DBMS for algorithms on TDs. A key advantage of our approach is that DBMSs naturally allow for dealing with huge tables with a limited amount of main memory (RAM). [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
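At its most naive, "counting inside a DBMS" looks like the sketch below: count the satisfying assignments of (x OR y) AND (NOT x OR z) with one self-joined Boolean relation per variable. The paper goes much further, running dynamic programming over tree decompositions inside PostgreSQL; sqlite3 appears here only for portability:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (v INTEGER)")               # the Boolean domain
conn.executemany("INSERT INTO b VALUES (?)", [(0,), (1,)])

count = conn.execute("""
    SELECT COUNT(*)
    FROM b AS x, b AS y, b AS z          -- one relation per variable
    WHERE (x.v = 1 OR y.v = 1)           -- clause (x OR y)
      AND (x.v = 0 OR z.v = 1)           -- clause (NOT x OR z)
""").fetchone()[0]
print(count)   # 4 of the 8 assignments satisfy the formula
```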
28. Bringing it all together – Collaborative Active Learning on a Virtually Shared Screen in Higher Education and Digitally.
- Author
- Kaufmann, Jens, Hoseini, Sayed, Quindeau, Pascal, Quix, Christoph, and Ruschin, Sylvia
- Published
- 2021
- Full Text
- View/download PDF
29. OurRocks: Offloading Disk Scan Directly to GPU in Write-Optimized Database System.
- Author
- Choi, Won Gi, Kim, Doyoung, Roh, Hongchan, and Park, Sanghyun
- Subjects
- DATABASES, GRAPHICS processing units, OPTICAL disks, HIGH performance computing, CENTRAL processing units, COMPUTER storage devices
- Abstract
The log structured merge (LSM) tree has been widely adopted by database systems owing to its superior write performance. However, LSM-tree based databases struggle when processing analytical queries due to the read amplification caused by their architecture and the limited use of storage devices with high bandwidth. To flexibly handle transactional and analytical workloads, we proposed and implemented OurRocks, which takes full advantage of NVMe SSD and GPU devices to improve scan performance. Although the NVMe SSD delivers multi-GB/s I/O rates, it is necessary to overcome the data transfer overhead that limits the benefits of GPU processing. The primary idea is to offload the scan operation to the GPU with filtering predicate pushdown and to resolve the bottleneck of data transfer between devices with direct memory access (DMA). OurRocks benefits from all the features of write-optimized database systems, in addition to accelerating analytic queries using the aforementioned ideas. Experimental results indicate that OurRocks effectively leverages the resources of the NVMe SSD and GPU and significantly improves the execution of queries in the YCSB and TPC-H benchmarks, compared to the conventional write-optimized database. Our research demonstrates that the proposed approach can speed up the handling of data-intensive workloads. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
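The filter-offload idea (though not OurRocks' DMA-based integration with the LSM engine) can be illustrated with CuPy, assuming a CUDA-capable device is available:

```python
import numpy as np
import cupy as cp   # requires a CUDA-capable GPU

values = np.random.default_rng(0).integers(0, 1_000_000, size=10_000_000)

gpu_values = cp.asarray(values)           # one bulk host-to-device transfer
mask = gpu_values > 990_000               # predicate evaluated on the GPU
selected = cp.asnumpy(gpu_values[mask])   # only qualifying rows travel back

print(len(selected), "of", len(values), "rows passed the filter")
```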
30. Local Databases or Cloud Computing Services: Cybersecurity Issues at the NUST, Zimbabwe.
- Author
- Mthwazi, Guidance
- Abstract
The ever-changing technological environment compels Information Technology (IT) personnel to constantly keep abreast of the Information Systems/Information Technology Strategy (IS/IT Strategy) dynamics, in order to minimise the gap between technological practice and the global environment. This paper explores that gap at the National University of Science and Technology (NUST) in Zimbabwe, using the case of modern-day Cloud Computing (CC) in comparison to the IT infrastructure and local database systems used at NUST. The paper aims at assessing the potential pros and cons of NUST adopting CC database systems. By assessing the CC services and NUST's organisational strategy, and evaluating a detailed risk/benefit analysis associated with them, the paper assists organisations considering adoption or implementation of this platform. The study was qualitative in nature and made use of Key Informant interviews, document analyses and observations. While all of our Key Informants agreed that CC would be a more efficient and effective tool for all at NUST, major concerns were raised regarding the Confidentiality, Integrity and Availability (CIA) of information. Concerning the potential cons of CC, the study also revealed that CC services are periodically standardised, hence the need to upskill IT personnel and users regularly. Government policies were also shown to pose potential hindrances to CC, as state universities are the government's strategic institutions. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
31. Concerto: Dynamic Processor Scaling for Distributed Data Systems with Replication.
- Author
- Lee, Jinsu and Lee, Eunji
- Subjects
- DATA replication, CONCERTO, SERVER farms (Computer network management), MICROGRIDS, DATABASES
- Abstract
A surge of interest in data-intensive computing has led to a drastic increase in the demand for data centers. Given this growing popularity, data centers are becoming a primary contributor to the increased consumption of energy worldwide. To mitigate this problem, this paper revisits DVFS (Dynamic Voltage Frequency Scaling), a well-known technique to reduce the energy usage of processors, from the viewpoint of distributed systems. Distributed data systems typically adopt a replication facility to provide high availability and short latency. In this type of architecture, the replicas are maintained in an asynchronous manner, while the master synchronously serves user requests. Based on this relaxed constraint on replicas, we present a novel DVFS technique called Concerto, which intentionally scales down the frequency of processors operating for the replicas. This mechanism can achieve considerable energy savings without an increase in the user-perceived latency. We implemented Concerto on Redis 6.0.1, a commercial-level distributed key-value store, demonstrating that all associated performance issues were resolved. To prevent a delay in read queries assigned to the replicas, we offload the independent part of the read operation to the fast-running thread. We also empirically demonstrate that the decreased performance of the replica does not cause an increase in replication lag, because the inherent load imbalance between the master and replica hides the increased latency of the replica. Performance evaluations with micro and real-world benchmarks show that Redis saves 32% on average and up to 51% of energy with Concerto under various workloads, with minor performance losses in the replicas. Despite numerous studies of energy saving in data centers, to the best of our knowledge, Concerto is the first approach that considers clock-speed scaling at the aggregate level, exploiting heterogeneous performance constraints across data nodes. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
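On Linux, the frequency capping Concerto applies to replica cores can be approximated through the standard cpufreq sysfs interface (root privileges required; Concerto itself embeds the decision in the store's internals rather than scripting it like this):

```python
def cap_cpu_freq(cpu: int, khz: int) -> None:
    # Standard Linux cpufreq path; requires root.
    path = f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_max_freq"
    with open(path, "w") as f:
        f.write(str(khz))

# Hypothetical layout: replica threads pinned to cores 4-7 get capped
# at 1.2 GHz while the master's cores keep their full clock.
for cpu in range(4, 8):
    cap_cpu_freq(cpu, 1_200_000)
```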
32. A Trusted Federated System to Share Granular Data Among Disparate Database Resources.
- Author
- DeFranco, Joanna F., Ferraiolo, David F., Kuhn, Rick, and Roberts, Joshua
- Subjects
- INFORMATION sharing, DATABASES, ACCESS control, BIG data
- Abstract
Sharing data between organizations is difficult due to different database management systems imposing different schemas as well as security and privacy concerns. We leverage two proven NIST technologies to address the problem: Next Generation Database Access Control and the data block matrix. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
33. RelaX: Interactive Relational Algebra in Teaching.
- Author
- Specht, Günther, Kessler, Johannes, Mayerl, Maximilian, and Tschuggnall, Michael
- Published
- 2021
- Full Text
- View/download PDF
34. A Comparative Study of Consistent Snapshot Algorithms for Main-Memory Database Systems.
- Author
- Li, Liang, Wang, Guoren, Wu, Gang, Yuan, Ye, Chen, Lei, and Lian, Xiang
- Subjects
- DATABASES, ALGORITHMS, BIG data, COMPARATIVE studies, COMPUTER science
- Abstract
In-memory databases (IMDBs) are gaining increasing popularity in big data applications, where clients commit updates intensively. Specifically, it is necessary for IMDBs to have efficient snapshot performance to support certain special applications (e.g., consistent checkpointing, HTAP). Formally, the in-memory consistent snapshot problem refers to taking an in-memory consistent point-in-time snapshot with the constraints that 1) clients can read the latest data items and 2) any data item in the snapshot should not be overwritten. Various snapshot algorithms have been proposed in academia to trade off throughput and latency, but industrial IMDBs such as Redis adhere to the simple fork algorithm. To understand this phenomenon, we conduct comprehensive performance evaluations on mainstream snapshot algorithms. Surprisingly, we observe that the simple fork algorithm indeed outperforms the state of the art in update-intensive workload scenarios. On this basis, we identify the drawbacks of existing research and propose two lightweight improvements. Extensive evaluations on synthetic data and Redis show that our lightweight improvements yield better performance than fork, the current industrial standard, and the representative snapshot algorithms from academia. Finally, we have open-sourced the implementation of all the above snapshot algorithms so that practitioners are able to benchmark the performance of each algorithm and select proper methods for different application scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
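The fork algorithm the paper benchmarks, and that Redis uses in practice, fits in a few lines on POSIX systems: the child process sees a copy-on-write image frozen at fork time while the parent keeps applying updates:

```python
import json, os

data = {f"key{i}": i for i in range(100_000)}   # the in-memory store

pid = os.fork()
if pid == 0:
    # Child: its view of `data` is the consistent point-in-time snapshot.
    with open("/tmp/snapshot.json", "w") as f:
        json.dump(data, f)
    os._exit(0)

# Parent: keeps serving writes; copy-on-write pages preserve the child's view.
data["key0"] = -1
os.waitpid(pid, 0)
print("snapshot persisted while updates continued")
```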
35. Security and Privacy Implications on Database Systems in Big Data Era: A Survey.
- Author
- Samaraweera, G. Dumindu and Chang, J. Morris
- Subjects
- DATABASES, BIG data, DATA management, SECURITY classification (Government documents), TELECOMMUNICATION, RELATIONAL databases, NONRELATIONAL databases
- Abstract
For many decades, the relational database model has been considered the leading model for data storage and management. However, as the Big Data explosion has generated a large volume of data, alternative models like NoSQL and NewSQL have emerged. With the advancement of communication technology, these database systems have gained the potential to move from a centralized architecture to a distributed one, deployed as cloud-based solutions. Though all of these evolving technologies mostly focus on performance guarantees, how these systems can ensure the security and privacy of the information they handle remains a major concern. Different datastores support different types of integrated security mechanisms; however, most of the non-relational database systems have overlooked the security requirements of modern Big Data applications. This paper reviews security implementations in today's leading database models, giving more emphasis to security and privacy attributes. A set of standard security mechanisms have been identified and evaluated based on different security classifications. Further, it provides a thorough review and a comprehensive analysis of the maturity of security and privacy implementations in these database models, along with future directions and enhancements, so that data owners can decide on the most appropriate datastore for their data-driven Big Data applications. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
36. Softar, Educational Software for Database Relational Algebra.
- Author
- Figueredo León, Ángel Enrique, Rodríguez Sánchez, Edel Ángel, and Silva Pérez, Yanet Esmeralda
- Published
- 2021
37. Self-driving database systems: a conceptual approach.
- Author
- Kossmann, Jan and Schlosser, Rainer
- Subjects
- DATABASES, DATABASE design, ALGORITHMS, LINEAR programming, SCALABILITY
- Abstract
Challenges for self-driving database systems, which tune their physical design and configuration autonomously, are manifold: Such systems have to anticipate future workloads, find robust configurations efficiently, and incorporate knowledge gained by previous actions into later decisions. We present a component-based framework for self-driving database systems that enables database integration and development of self-managing functionality with low overhead by relying on separation of concerns. By keeping the components of the framework reusable and exchangeable, experiments are simplified, which promotes further research in that area. Moreover, to optimize multiple mutually dependent features, e.g., index selection and compression configurations, we propose a linear programming (LP) based algorithm to derive an efficient tuning order automatically. Afterwards, we demonstrate the applicability and scalability of our approach with reproducible examples. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
38. Dynamic Exclusion Zones for Protecting Primary Users in Database-Driven Spectrum Sharing.
- Author
- Bhattarai, Sudeep, Park, Jung-Min, and Lehr, William
- Subjects
- RADIO interference, ZONING, SYNTHETIC apertures, SPECTRUM allocation, TELECOMMUNICATION systems
- Abstract
In spectrum sharing, a spatial separation region is defined around a primary user (PU) where co-channel and/or adjacent channel secondary users (SUs) are not allowed to operate. This region is often called an Exclusion Zone (EZ), and it protects the PU from harmful interference caused by SUs. Unfortunately, existing methods for defining an EZ prescribe a static and an overly conservative boundary, which often leads to poor spectrum utilization efficiency. In this paper, we propose a novel framework—namely, Multi-tiered dynamic Incumbent Protection Zones (MIPZ)—for prescribing interference protection for PUs. MIPZ can be used to dynamically adjust the PU’s protection boundary based on the changing radio interference environment. MIPZ can also serve as an analytical tool for quantitatively analyzing a given protection region to gain insights on and determine the trade-off between interference protection and spectrum utilization efficiency. Using results from extensive simulations and a real-world case study, we demonstrate the effectiveness of MIPZ in protecting PUs from harmful interference and in improving the overall spectrum utilization efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
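To see why a dynamically sized zone matters, the sketch below derives an exclusion radius from an interference budget using the textbook log-distance path-loss model. This is standard radio engineering, not the MIPZ algorithm itself, and every parameter value is illustrative.
```python
# Exclusion radius from an interference budget under log-distance path loss:
# P_rx(d) = tx - PL(d), with PL(d) = PL(d0) + 10*n*log10(d/d0).
import math

def min_separation_m(tx_dbm, max_interference_dbm,
                     pl_d0_db=40.0, d0_m=1.0, exponent=3.0):
    """Smallest SU-PU distance keeping received interference at or
    below the budget."""
    required_loss_db = tx_dbm - max_interference_dbm
    return d0_m * 10 ** ((required_loss_db - pl_d0_db) / (10 * exponent))

# An SU transmitting at 30 dBm, a PU tolerating at most -100 dBm:
print(f"exclusion radius ~ {min_separation_m(30.0, -100.0):.0f} m")

# A lossier environment (higher exponent) shrinks the required zone,
# which is one reason a static worst-case boundary wastes spectrum.
print(f"with n=4: ~ {min_separation_m(30.0, -100.0, exponent=4.0):.0f} m")
```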
39. A survey of new techniques for AI-empowered query processing and optimization.
- Author
-
宋雨萌, 谷峪, 李芳芳, and 于戈
- Abstract
Copyright of Journal of Frontiers of Computer Science & Technology is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2020
- Full Text
- View/download PDF
40. The Disruptions of 5G on Data-Driven Technologies and Applications.
- Author
-
Loghin, Dumitrel, Cai, Shaofeng, Chen, Gang, Dinh, Tien Tuan Anh, Fan, Feiyi, Lin, Qian, Ng, Janice, Ooi, Beng Chin, Sun, Xutao, Ta, Quang-Trung, Wang, Wei, Xiao, Xiaokui, Yang, Yang, Zhang, Meihui, and Zhang, Zhonghua
- Subjects
SMART cities ,TECHNOLOGICAL innovations ,TECHNOLOGY ,DATA management ,DATABASES ,LANDSCAPE assessment ,LANDSCAPE design - Abstract
With 5G on the verge of being adopted as the next mobile network, there is a need to analyze its impact on the landscape of computing and data management. In this paper, we analyze the impact of 5G on both traditional and emerging technologies and project our view on future research challenges and opportunities. With a predicted increase of 10-100x in bandwidth and 5-10x decrease in latency, 5G is expected to be the main enabler for smart cities, smart IoT and efficient healthcare, where machine learning is conducted at the edge. In this context, we investigate how 5G can help the development of federated learning. Network slicing, another key feature of 5G, allows running multiple isolated networks on the same physical infrastructure. However, security remains the main concern in the context of virtualization, multi-tenancy and high device density. Formal verification of 5G networks can be applied to detect security issues in massive virtualized environments. In summary, 5G will make the world even more densely and closely connected. What we have experienced in 4G connectivity will pale in comparison to the vast amounts of possibilities engendered by 5G. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
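Since the abstract highlights federated learning at the 5G edge, here is a minimal FedAvg-style aggregation step sketched with NumPy. The client updates and sample counts are synthetic placeholders; real deployments add secure aggregation, compression, and scheduling over the 5G link.
```python
# Minimal FedAvg-style aggregation: a weighted average of client model
# parameters, weighted by each client's local data size.
import numpy as np

def fed_avg(client_weights, n_samples):
    total = sum(n_samples)
    return sum(w * (n / total) for w, n in zip(client_weights, n_samples))

# Three edge clients with different amounts of local data (synthetic):
rng = np.random.default_rng(0)
clients = [rng.normal(size=4) for _ in range(3)]
counts = [100, 400, 500]

global_model = fed_avg(clients, counts)
print("aggregated parameters:", np.round(global_model, 3))
```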
41. Make the most out of your SIMD investments: counter control flow divergence in compiled query pipelines.
- Author
-
Lang, Harald, Passing, Linnea, Kipf, Andreas, Boncz, Peter, Neumann, Thomas, and Kemper, Alfons
- Abstract
Increasing single instruction multiple data (SIMD) capabilities in modern hardware allow for the compilation of data-parallel query pipelines. This raises GPU-like challenges: control flow divergence causes underutilization of vector-processing units. In this paper, we present efficient algorithms for the AVX-512 architecture to address this issue. These algorithms allow for the fine-grained assignment of new tuples to idle SIMD lanes. Furthermore, we present strategies for their integration with compiled query pipelines so that tuples are never evicted from registers. We evaluate our approach with three query types: (i) a table-scan query based on TPC-H Query 1, which performs up to 34% faster when underutilization is addressed, (ii) a hash-join query, where we observe up to 25% higher performance, and (iii) an approximate geospatial join query, which shows performance improvements of up to 30%. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
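The divergence problem and the refill strategy can be mimicked in scalar code: after a selective predicate, some SIMD lanes hold filtered-out tuples and sit idle, so fresh tuples from the input stream are assigned to exactly those lanes. The NumPy sketch below models only the bookkeeping; the paper's algorithms use AVX-512 mask and compress/expand instructions.
```python
# Scalar simulation of SIMD lane refill after a selective predicate.
import numpy as np

LANES = 8
rng = np.random.default_rng(1)
tuples = rng.integers(0, 100, size=64)    # incoming tuple stream
vec = tuples[:LANES].copy()               # the "vector register"
pos = LANES

for step in range(5):                     # each step: one selective operator
    survives = vec % 2 == 0               # predicate; failing lanes go idle
    idle = np.flatnonzero(~survives)
    refill = tuples[pos:pos + idle.size]  # fresh tuples for the idle lanes
    vec[idle[:refill.size]] = refill
    pos += refill.size
    print(f"step {step}: utilization before refill {survives.mean():.0%}, "
          f"refilled {refill.size} lanes")
```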
42. WAL-SSD: Address Remapping-Based Write-Ahead-Logging Solid-State Disks.
- Author
-
Han, Kyuhwa, Kim, Hyukjoong, and Shin, Dongkun
- Subjects
SOLID state drives ,FLASH memory ,HARD disks ,RANDOM access memory ,ATOMIC layer deposition - Abstract
Recent advances in flash memory technology have reduced the cost-per-bit of flash storage devices such as solid-state drives (SSDs), enabling the development of large-capacity SSDs for enterprise-scale storage. However, two major concerns arise in designing SSDs. First, the size of the address mapping table grows in proportion to the capacity of the SSD. The SSD-internal firmware, called the flash translation layer (FTL), must maintain the address mapping table in internal DRAM. Although the previously proposed demand map loading technique caches only a small portion of the map table, it aggravates the already poor random performance. Second, storage workloads contain many redundant writes, which adversely affect the performance and lifetime of the SSD. For example, many transaction-supporting applications use a write-ahead log (WAL) scheme, which writes the same data twice. To resolve these problems, we propose a novel transaction-supporting SSD, called WAL-SSD, which logs transaction data in an internally managed WAL area and relocates the data atomically via an FTL-level remap operation at transaction checkpointing. It can also be used to transform random write requests into sequential requests. We implemented a prototype of WAL-SSD on a real SSD device. Experiments demonstrate the performance improvement of WAL-SSD with three use cases: remap-journaling, atomic multi-block update, and random write logging. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
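A toy model of the remap idea follows: transaction data is first written out of place to a log area, and commit atomically redirects the logical-to-physical mapping to the logged pages, avoiding the second write of a classic WAL scheme. The page layout and allocator here are invented for illustration.
```python
# Toy FTL with a remap-based commit, in the spirit of WAL-SSD.
class ToyFTL:
    def __init__(self):
        self.l2p = {}          # logical page -> physical page
        self.next_phys = 0     # naive out-of-place allocator
        self.flash = {}        # physical page -> data

    def _alloc(self, data):
        phys = self.next_phys
        self.next_phys += 1
        self.flash[phys] = data
        return phys

    def write_log(self, logical, data):
        """Write the WAL copy out of place; the old mapping stays valid."""
        return logical, self._alloc(data)

    def commit(self, log_entries):
        """Atomically remap logical pages to their logged physical pages,
        skipping the second (in-place) write of a classic WAL scheme."""
        for logical, phys in log_entries:
            self.l2p[logical] = phys

ftl = ToyFTL()
txn = [ftl.write_log(page, f"v2-of-page-{page}") for page in (3, 7)]
ftl.commit(txn)             # one remap instead of rewriting the data
print(ftl.l2p, ftl.flash)
```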
43. Design and Implementation of SSD-Assisted Backup and Recovery for Database Systems.
- Author
-
Son, Yongseok, Kim, Moonsub, Kim, Sunggon, Yeom, Heon Young, Kim, Nam Sung, and Han, Hyuck
- Subjects
SOLID state drives ,DATABASES ,RAID (Computer science) ,SOFTWARE failures - Abstract
As flash-based solid-state drives (SSDs) become more prevalent owing to the rapid fall in price and the significant increase in capacity, customers expect better data services than traditional disk-based systems provide. However, the order-of-magnitude performance gains and new characteristics of flash require a rethinking of these data services. For example, backup and recovery is an important service in a database system, since it protects data against unexpected hardware and software failures. Backup and recovery can be provided by dedicated tools or by operating-system methods; however, the tools perform time-consuming jobs, and the OS methods may degrade run-time performance during normal operation even when high-performance SSDs are used. To handle these issues, we propose an SSD-assisted backup/recovery scheme for database systems. Our scheme utilizes the characteristics of flash-based SSDs (e.g., out-of-place updates) for backup/recovery operations. To this end, we exploit the resources inside the SSD (e.g., the flash translation layer and DRAM cache with supercapacitors), and we call our SSD with this new backup/recovery functionality BR-SSD. We design and implement the functionality in the Samsung enterprise-class SSD (i.e., SM843Tn) for more realistic systems. Furthermore, we integrate BR-SSDs into database systems (i.e., MySQL) in replication and redundant array of independent disks (RAID) environments, as well as in a single BR-SSD. The experimental results demonstrate that our scheme provides fast backup and recovery without degrading run-time performance during normal operation. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
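The reason out-of-place updates make backup cheap can be shown in a few lines: since an overwrite never destroys the old physical page, a backup can be just a frozen copy of the address mapping, and recovery reinstates it. The sketch mirrors the spirit of BR-SSD, not its actual firmware design.
```python
# Snapshot-style backup/recovery on top of out-of-place updates.
class SnapshotSSD:
    def __init__(self):
        self.l2p, self.flash, self.next = {}, {}, 0
        self.backups = {}

    def write(self, logical, data):        # out-of-place: old page survives
        self.flash[self.next] = data
        self.l2p[logical] = self.next
        self.next += 1

    def backup(self, tag):                 # O(#mapped pages), no data copied
        self.backups[tag] = dict(self.l2p)

    def recover(self, tag):                # point-in-time restore
        self.l2p = dict(self.backups[tag])

    def read(self, logical):
        return self.flash[self.l2p[logical]]

ssd = SnapshotSSD()
ssd.write(0, "balance=100")
ssd.backup("before-crash")
ssd.write(0, "balance=9999")               # corrupted by a buggy update
ssd.recover("before-crash")
print(ssd.read(0))                         # -> balance=100
```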
44. Prescriptive analytics: a survey of emerging trends and technologies.
- Author
-
Frazzetto, Davide, Nielsen, Thomas Dyhre, Pedersen, Torben Bach, and Šikšnys, Laurynas
- Abstract
This paper provides a survey of the state of the art and future directions of one of the most important emerging technologies within business analytics (BA), namely prescriptive analytics (PSA). BA focuses on data-driven decision-making and consists of three phases: descriptive, predictive, and prescriptive analytics. While descriptive and predictive analytics allow us to analyze the past and predict future events, respectively, these activities do not provide any direct support for decision-making. Here, PSA fills the gap between data and decisions. We have observed an increasing interest in in-DBMS PSA systems in both research and industry. Thus, this paper aims to provide a foundation for PSA as a separate field of study. To do this, we first describe the different phases of BA. We then survey classical analytics systems and identify their main limitations for supporting PSA, based on which we introduce the criteria and methodology used in our analysis. We next survey, categorize, and discuss the state of the art within emerging, so-called PSA+ systems, followed by a presentation of the main challenges and opportunities for next-generation PSA systems. Finally, the main findings are discussed and directions for future research are outlined. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
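The three phases of BA can be compressed into a toy example: descriptive statistics summarize past demand, a naive forecast predicts it, and a newsvendor-style rule prescribes a stocking decision. The decision rule is a standard textbook result under a normality assumption, not a PSA+ system; all numbers are invented.
```python
# Descriptive -> predictive -> prescriptive on a toy demand series.
import statistics

demand = [12, 15, 11, 14, 16, 13, 15, 14]           # past daily demand

# Descriptive: what happened.
mean, stdev = statistics.mean(demand), statistics.stdev(demand)

# Predictive: what will happen (a deliberately naive mean forecast).
forecast = mean

# Prescriptive: what to do about it. Newsvendor critical fractile under
# a normality assumption: stock up to the profit-optimal service level.
unit_profit, unit_loss = 5.0, 2.0                   # margin vs. overstock cost
service_level = unit_profit / (unit_profit + unit_loss)
z = statistics.NormalDist().inv_cdf(service_level)
order_qty = forecast + z * stdev

print(f"mean={mean:.2f} stdev={stdev:.2f} service={service_level:.0%}")
print(f"prescribed order quantity: {order_qty:.1f} units")
```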
45. Scalable Linear Algebra on a Relational Database System.
- Author
-
Luo, Shangyu, Gao, Zekai J., Gubanov, Michael, Perez, Luis L., and Jermaine, Christopher
- Subjects
DISTRIBUTED databases ,LINEAR algebra ,DATABASE management ,RELATIONAL databases ,ABSTRACT data types (Computer science) - Abstract
As data analytics has become an important application for modern data management systems, a new category of data management system has appeared recently: the scalable linear algebra system. In this paper, we argue that a parallel or distributed database system is actually an excellent platform upon which to build such functionality. Most relational systems already have support for cost-based optimization—which is vital to scaling linear algebra computations—and it is well-known how to make relational systems scale. We show that by making just a few changes to a parallel/distributed relational database system, such a system can be a competitive platform for scalable linear algebra. Taken together, our results should at least raise the possibility that brand new systems designed from the ground up to support scalable linear algebra are not absolutely necessary, and that such systems could instead be built on top of existing relational technology. Our results also suggest that if scalable linear algebra is to be added to a modern dataflow platform such as Spark, they should be added on top of the system's more structured (relational) data abstractions, rather than being constructed directly on top of the system's raw dataflow operators. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
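The core trick the paper builds on can be demonstrated on any relational engine: store matrices as (row, column, value) tuples and express matrix multiplication as a join plus aggregation, letting the optimizer do the work. SQLite is used below purely for a self-contained illustration; the paper targets parallel and distributed systems.
```python
# Sparse matrix multiply as SQL: C[i,j] = SUM_k A[i,k] * B[k,j].
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE A (i INT, j INT, v REAL);
    CREATE TABLE B (i INT, j INT, v REAL);
""")
con.executemany("INSERT INTO A VALUES (?,?,?)",
                [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 4.0)])
con.executemany("INSERT INTO B VALUES (?,?,?)",
                [(0, 0, 5.0), (0, 1, 6.0), (1, 0, 7.0), (1, 1, 8.0)])

rows = con.execute("""
    SELECT A.i, B.j, SUM(A.v * B.v) AS v
    FROM A JOIN B ON A.j = B.i
    GROUP BY A.i, B.j
    ORDER BY A.i, B.j
""").fetchall()
print(rows)   # [(0,0,19.0), (0,1,22.0), (1,0,43.0), (1,1,50.0)]
```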
46. GreenDB: Energy-Efficient Prefetching and Caching in Database Clusters.
- Author
-
Zhou, Yi, Taneja, Shubbhi, Zhang, Chaowei, and Qin, Xiao
- Subjects
ENERGY consumption ,DATABASES ,MATHEMATICAL models ,ENERGY conservation ,DATA libraries - Abstract
In this study, we propose an energy-efficient database system called GreenDB, which runs on clusters. GreenDB applies a workload-skewness strategy by managing hot nodes coupled with a set of cold nodes in a database cluster. GreenDB fetches popular data tables to hot nodes, aiming to keep cold nodes in low-power mode for longer periods. This reduces the number of power-state transitions, thereby lowering energy-saving overhead. A prefetching model and an energy-saving model are seamlessly integrated into GreenDB to facilitate power management in database clusters. We quantitatively evaluate GreenDB's energy efficiency in terms of managing, fetching, and storing data, and we compare its prefetching strategy with the one implemented in PostgreSQL. Experimental results indicate that GreenDB reduces the energy consumption of the existing solution by up to 98.4 percent. The findings show that the energy efficiency of GreenDB can be further optimized by tuning system parameters, including table size, hit rates, number of nodes, number of disks, and inter-arrival delays. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
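A toy version of the workload-skewness strategy: route the most popular tables to a small set of hot nodes so cold nodes can stay in low-power mode longer. Node counts, capacities, and access frequencies below are invented for illustration.
```python
# Hot/cold table placement driven by access popularity.
from collections import Counter

accesses = Counter({"orders": 900, "lineitem": 750, "customer": 80,
                    "nation": 12, "region": 5, "archive": 2})
HOT_NODES, HOT_CAPACITY = ["hot-0", "hot-1"], 2     # tables per hot node

placement = {}
hot_slots = [(n, HOT_CAPACITY) for n in HOT_NODES]
for table, _count in accesses.most_common():
    for idx, (node, free) in enumerate(hot_slots):
        if free > 0:                       # hottest tables fill hot nodes first
            placement[table] = node
            hot_slots[idx] = (node, free - 1)
            break
    else:
        placement[table] = "cold-pool"     # rarely touched -> low-power nodes

hot_traffic = sum(c for t, c in accesses.items() if placement[t] != "cold-pool")
print(placement)
print(f"accesses served by hot nodes: {hot_traffic / sum(accesses.values()):.0%}")
```
Because table popularity is typically highly skewed, a handful of hot nodes can absorb nearly all traffic, which is what lets the cold pool sleep.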
47. Verification and validation techniques for streaming big data analytics in internet of things environment.
- Author
-
Kumari, Aparna, Tanwar, Sudeep, Tyagi, Sudhanshu, and Kumar, Neeraj
- Abstract
With the exponential growth of raw data generated by sensors, actuators, and mobile devices, data analytics is becoming challenging in view of the heterogeneity of the generated data. Traditional database systems are not able to handle this huge amount of data. Current research and development focus mainly on the analytics of the big data generated by these devices and overlook the difficulty of streaming it securely. In this study, the authors analyse and provide an overview of how to handle secure streaming data generated by different devices, including the major threats and risks introduced during big data processing in an Internet of Things (IoT) environment. Moreover, they analyse existing security approaches and highlight emerging challenges, such as denial of service, malware, and phishing, that must be addressed to process heterogeneous big data securely in IoT applications. The architectural details and security approaches required in each phase of the big data processing life-cycle are explored in detail. Finally, various research challenges, along with a case study, are discussed and analysed. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
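One concrete building block of secure streaming is integrity-checking each record on arrival so tampered sensor readings are dropped before analytics. The sketch uses only Python's standard-library HMAC; the key handling and record format are simplified placeholders.
```python
# Per-record integrity checks for a sensor stream using HMAC-SHA256.
import hmac, hashlib, json

KEY = b"shared-device-key"   # in practice: per-device keys from provisioning

def sign(record):
    payload = json.dumps(record, sort_keys=True).encode()
    return {"payload": payload.decode(),
            "mac": hmac.new(KEY, payload, hashlib.sha256).hexdigest()}

def verify(msg):
    expected = hmac.new(KEY, msg["payload"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, msg["mac"])   # constant-time compare

msg = sign({"sensor": "temp-17", "value": 21.5})
print(verify(msg))                                     # True
msg["payload"] = msg["payload"].replace("21.5", "99.9")
print(verify(msg))                                     # False: tampered in flight
```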
48. Making big data small.
- Author
-
Wenfei Fan
- Subjects
BIG data ,PARALLEL processing ,DATABASES ,SQL - Abstract
Big data analytics is often prohibitively costly and is typically conducted by parallel processing with a cluster of machines. Is big data analytics beyond the reach of small companies that can only afford limited resources? This paper tackles this question by presenting Boundedly EvAluable SQL (BEAS), a system for querying big relations with constrained resources. The idea is to make big data small. To answer a query posed on a dataset, it often suffices to access a small fraction of the data no matter how big the dataset is. In the light of this, BEAS answers queries on big data by identifying and fetching a small set of the data needed. Under available resources, it computes exact answers whenever possible and otherwise approximate answers with accuracy guarantees. Underlying BEAS are principled approaches of bounded evaluation and data-driven approximation, the focus of this paper. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
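A toy rendering of the "make big data small" idea: identify the small fraction of tuples a query needs via an index, answer exactly if that fits the access budget, and otherwise fall back to a sampled approximation. BEAS's bounded-evaluation theory is far more general; this only conveys the access-budget intuition.
```python
# Budget-bounded query answering over a synthetic 100k-row table.
import random

random.seed(42)
data = [(i, random.choice(["de", "fr", "uk"]), random.random())
        for i in range(100_000)]
index = {}                                   # country -> row ids
for rid, country, _ in data:
    index.setdefault(country, []).append(rid)

def avg_score(country, budget):
    ids = index[country]                     # the query only needs these rows
    if len(ids) <= budget:                   # exact answer within the budget
        vals = [data[i][2] for i in ids]
        return sum(vals) / len(vals), "exact"
    sample = random.sample(ids, budget)      # otherwise: bounded, approximate
    vals = [data[i][2] for i in sample]      # (error ~ O(1/sqrt(budget)))
    return sum(vals) / len(vals), f"approx from {budget} rows"

print(avg_score("de", budget=50_000))        # exact: matching rows fit budget
print(avg_score("de", budget=1_000))         # approximate under a tight budget
```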
49. Capturing More Meaning in Databases.
- Author
-
Maier, David
- Subjects
DATABASE management ,ELECTRONIC information resources ,SEMANTICS ,PROGRAMMING languages ,INFORMATION theory ,INFORMATION resources management - Abstract
Users must understand the meaning of a database before they can intelligently interpret the data in it. We explore the problem of making databases self-descriptive. We first examine the levels of semantics of data by analogy to the area of programming languages. Next, we consider several semantic data models, which are formal languages for conceptual data description, and how a semantic description can be linked to the objects in a database. We discuss the advantages of making a semantic description part of a database itself and the types of application programs that would use that description. In particular, we focus on PIQUE, a high-level relational query language that uses such semantic information. [ABSTRACT FROM AUTHOR]
- Published
- 1984
- Full Text
- View/download PDF
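Modern systems realize part of this vision: the schema is itself stored as queryable data. The snippet below demonstrates this against SQLite's built-in catalog; PIQUE, of course, aimed at far richer semantics than table and column names.
```python
# A self-descriptive database in miniature: querying the catalog.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE employee (
    id INTEGER PRIMARY KEY,
    name TEXT,   -- person's legal name
    dept TEXT    -- department code
)""")

# The schema is ordinary data: the catalog can be queried like any table.
for name, sql in con.execute("SELECT name, sql FROM sqlite_master"):
    print(name, "->", sql)

# Column-level metadata is queryable too.
for cid, name, ctype, *_rest in con.execute("PRAGMA table_info(employee)"):
    print(cid, name, ctype)
```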
50. Design and Implementation of Secure Management System for Electronic Library.
- Author
-
Hatem, Haider S. and Mohammed, Ghada Salim
- Subjects
DIGITAL library security measures ,ACADEMIC library security ,WEB design ,LIBRARY security ,DIGITAL libraries ,DATA protection - Abstract
Copyright of Journal of Madenat Al-Elem University College / Magallaẗ Kulliyyaẗ Madīnaẗ Al-ʿAlam Al-Ğāmi'aẗ is the property of Republic of Iraq Ministry of Higher Education & Scientific Research (MOHESR) and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2019