1. Data and Conceptual Model Synchronization in Data-Intensive Domains: The Human Genome Case
- Author
- Floris Emanuel, Oscar Pastor, and Verónica Burriel
- Subjects
- Structure (mathematical logic), Software development process, Computer science, Synchronization (computer science), Conceptual model, Information system, Context (language use), Data science, Conceptual schema, Domain (software engineering)
- Abstract
- Context and Motivation: With the increasing quantity and versatility of data in data-intensive domains, designing information systems that effectively process the relevant information is becoming increasingly challenging. Conceptual modeling, as a preliminary phase in the software development process, could tackle such challenges in numerous ways. But assessing data and model synchronization becomes an issue in domains where data are heterogeneous, have diverse provenance, and are subject to continuous change. Question/Problem: The problem is how to determine and demonstrate the ability of a conceptual schema to represent the concepts and the data in a particular data-intensive domain. Principal Ideas/Results: A validation approach has been designed for the Conceptual Schema of the Human Genome by investigating the particular issues in the genetic domain and systematically connecting constituents of this conceptual schema with potential instances in samples of genome-related data. As a result, this approach gave us accurate insight into the schema's attribute resemblance, completeness, structure, and shortcomings. Contribution: This work demonstrates that the strategy of conceptualizing a data-intensive domain and then validating that conceptualization by reconnecting it with the attributes of the real-world data can be generalized. Conceptual modeling has only limited resistance to the evolution of data, which is the next problem to address.
- Published
- 2021