Automating Data Extraction from Semi-Structured Industrial Documents : The Alstom Experience

Authors :
Möller, T.
Singh, I.
Bucaioni, Alessio
Cicchetti, Antonio
Publication Year :
2024

Abstract

In the system development of modern railroad vehicles, engineers frequently use a plethora of diverse notations to specify various systems, subsystems, and their associated concerns. The use of diverse notations introduces complex challenges linked with their management and integration. Conventional practices, which rely on manual revisions and translations, prove to be both time-intensive and cost-prohibitive. In addition, they carry substantial risks of human error, thereby potentially introducing faults into the system. Such practices are deemed inadequate for the railway industry, which is safety-critical in its nature and places paramount importance on the assurance of reliability and data integrity. To address these challenges, we developed a regular expression-based system facilitating the automatic translation of semi-structured texts into structured data, with a particular focus on ensuring data integrity and reliability. We have defined the system capitalizing on the insights and practical experience of our industrial partner, Alstom Rail Sweden AB, and validated it within their development process. The validation demonstrated the practicality of the system in a real-world context and highlighted valuable lessons learned throughout the process. Building on these insights, we applied model-driven engineering principles to generalize the system, providing an automated solution to the data extraction challenge from tender documents in the railway domain.

Details

Database :
OAIster
Notes :
English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1482296583
Document Type :
Electronic Resource
Full Text :
https://doi.org/10.1109/ETFA61755.2024.10711023