1. Using Text Mining and Bayesian Network to Identify Key Risk Factors for Safety Accidents in Metro Construction.
- Author
Shen, Jianhong; Liu, Shupeng; and Zhang, Jing
- Subjects
TEXT mining, BAYESIAN analysis, SAFETY factor in engineering, ACCIDENT prevention, FEATURE extraction
- Abstract
Complex risk factors make metro construction prone to safety accidents of many types. Accident reports record detailed information about these accidents in text form, but using such unstructured data effectively is a significant challenge. Text mining (TM) provides a viable foundation for addressing this challenge, but related studies are limited in risk feature extraction and lack in-depth analysis capability. To address these deficiencies and provide a feasible strategy for identifying key risk factors in the metro construction domain, this paper proposes an integrated model combining TM and machine learning–based Bayesian networks. First, the term frequency-inverse document frequency (TF-IDF) algorithm in TM was used to separately extract the direct and indirect cause factors from the accident reports, with missing factors supplemented using the TextRank algorithm. Then, depending on whether conditional independence between factors was assumed, an improved naive Bayesian network (NBN) and a tree-augmented naive Bayesian network (TAN), respectively, were built from the extracted factors and the corresponding accident types for further in-depth analysis. Finally, the training set was divided to train the two network models, and sensitivity analysis was used to identify the key risk factors. Using 162 accident reports from China as an application example, the results showed that TAN achieved a higher average accuracy on the test set (79.62%) than the improved NBN (71.75%), and that TAN successfully ranked the importance of risk factors for different accident types from multiple perspectives. The analysis also yielded some new insights into metro accidents in China, which can support decision-making for accident prevention and control.
In conclusion, this paper effectively addresses the relevant limitations of accident text utilization and presents a novel approach for metro construction safety management. Analyzing accident texts can help gain insights from objective historical data to support safety management efforts. However, accident texts are often unstructured and contain a lot of irrelevant content. Safety managers have a continuing interest in how to quickly extract valid information from accident texts and use it to analyze accidents in depth, particularly through models that offer real-time decision support in addition to theoretical insights. This paper proposes an integrated model that combines text mining and machine-learning Bayesian networks. This model achieves comprehensive textual feature extraction and multifaceted accident causation analysis, and it allows safety managers to input current accident information to obtain real-time decision support for accident prevention and control. Although the proposed model was developed for metro construction, it can be adapted with minor changes to similar domains by incorporating the characteristics of their accident texts, so as to effectively control the occurrence of safety accidents. [ABSTRACT FROM AUTHOR]
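As a rough illustration of the TF-IDF step named in the abstract, the sketch below scores terms in a few tokenized reports and ranks them per report. It is not the authors' implementation: the toy English reports, the function names `tfidf_scores` and `top_terms`, and the unsmoothed formula tf × ln(N / df) are all assumptions made for the example, and the word segmentation, stop-word filtering, and TextRank supplementation that real accident reports would need are left out.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one {term: tf-idf} dict per tokenized document.

    Assumed weighting: tf = count / document length,
    idf = ln(N / df), with no smoothing.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(doc_scores, k=3):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    return sorted(doc_scores, key=lambda t: (-doc_scores[t], t))[:k]


# Hypothetical toy reports, already tokenized (not data from the paper).
reports = [
    "worker fell from scaffold without harness".split(),
    "scaffold collapsed after heavy rain".split(),
    "worker struck by crane load".split(),
]

scores = tfidf_scores(reports)
for i, doc_scores in enumerate(scores):
    print(i, top_terms(doc_scores))
```

Terms that occur in fewer reports, such as "harness", receive a higher weight than terms shared across reports, such as "scaffold", which is why TF-IDF can surface report-specific cause factors.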
- Published
- 2024