The E(G)TL Model: A Novel Approach for Efficient Data Handling and Extraction in Multivariate Systems

Authors :
Aleksejs Vesjolijs
Source :
Applied System Innovation, Vol 7, Iss 5, p 92 (2024)
Publication Year :
2024
Publisher :
MDPI AG, 2024.

Abstract

This paper introduces the EGTL (extract, generate, transfer, load) model, a theoretical framework designed to enhance traditional ETL processes by adding a novel ‘generate’ step that uses generative artificial intelligence (GenAI). This step optimizes data extraction and processing. The paper presents a high-level solution architecture that includes two new data storage concepts: the Fusion and Alliance stores. The Fusion store acts as a virtual space for immediate data cleaning and profiling post-extraction, facilitated by GenAI, while the Alliance store serves as a collaborative data warehouse for both business users and AI processes. EGTL was developed to facilitate advanced data handling and integration within digital ecosystems. This study defines the EGTL solution design, setting the groundwork for future practical implementations and exploring the integration of best practices from data engineering, including DataOps principles and data mesh architecture. The research shows how EGTL can improve the data engineering pipeline and illustrates the interactions between its components. The EGTL model was tested in the prototype web-based Hyperloop Decision-Making Ecosystem with tasks ranging from data extraction to code generation. Experiments demonstrated an overall success rate of 93% across five difficulty levels. Additionally, the study highlights key risks associated with EGTL implementation and offers comprehensive mitigation strategies.
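
The four stages named in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the paper's implementation: the class and function names (FusionStore, AllianceStore, run_pipeline, rule_based_clean) are hypothetical, the GenAI 'generate' step is replaced by a plain rule-based cleaner so the example runs without any model, and the abstract does not say what 'transfer' involves, so here it simply copies cleaned records toward a target table.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional

Record = dict  # one extracted row, e.g. {"id": 1, "name": "pod-A"}


@dataclass
class FusionStore:
    """Hypothetical stand-in for the Fusion store: holds freshly extracted records."""
    records: list = field(default_factory=list)


@dataclass
class AllianceStore:
    """Hypothetical stand-in for the Alliance store: shared tables keyed by name."""
    tables: dict = field(default_factory=dict)


def extract(source: Iterable[Record]) -> FusionStore:
    # Extract: pull raw records into the Fusion store.
    return FusionStore(records=list(source))


def generate(fusion: FusionStore,
             clean: Callable[[Record], Optional[Record]]) -> FusionStore:
    # Generate: the paper uses GenAI here; `clean` is an assumed placeholder
    # that returns a cleaned record, or None to drop the record.
    cleaned = [out for out in (clean(rec) for rec in fusion.records) if out is not None]
    return FusionStore(records=cleaned)


def transfer(fusion: FusionStore, table: str) -> tuple:
    # Transfer (assumed meaning): copy cleaned records toward a target table.
    return table, [dict(rec) for rec in fusion.records]


def load(alliance: AllianceStore, payload: tuple) -> AllianceStore:
    # Load: append the transferred rows to the Alliance store.
    table, rows = payload
    alliance.tables.setdefault(table, []).extend(rows)
    return alliance


def rule_based_clean(rec: Record) -> Optional[Record]:
    # Placeholder for GenAI cleaning: drop rows without an id, trim strings.
    if rec.get("id") is None:
        return None
    return {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}


def run_pipeline(source: Iterable[Record], table: str,
                 clean: Callable[[Record], Optional[Record]] = rule_based_clean) -> AllianceStore:
    fusion = generate(extract(source), clean)
    return load(AllianceStore(), transfer(fusion, table))
```

Calling run_pipeline on a small list of dictionaries returns an AllianceStore whose named table holds only the cleaned rows. In a real EGTL setup the clean callable would presumably wrap a GenAI service, which this sketch deliberately leaves out.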

Details

Language :
English
ISSN :
2571-5577
Volume :
7
Issue :
5
Database :
Directory of Open Access Journals
Journal :
Applied System Innovation
Publication Type :
Academic Journal
Accession number :
edsdoj.78d2f95f7fb45f48279d4e9b17a04b0
Document Type :
article
Full Text :
https://doi.org/10.3390/asi7050092