Robust Failure Diagnosis of Microservice System through Multimodal Data

Authors :
Zhang, Shenglin
Jin, Pengxiang
Lin, Zihan
Sun, Yongqian
Zhang, Bicheng
Xia, Sibo
Li, Zhengdan
Zhong, Zhenyu
Ma, Minghua
Jin, Wa
Zhang, Dai
Zhu, Zhenyu
Pei, Dan
Publication Year :
2023

Abstract

Automatic failure diagnosis is crucial for large microservice systems. Currently, most failure diagnosis methods rely solely on single-modal data (i.e., using either metrics, logs, or traces). In this study, we use real-world failure cases to show empirically that combining these sources of data (multimodal data) leads to a more accurate diagnosis. However, effectively representing these data and addressing imbalanced failures remain challenging. To tackle these issues, we propose DiagFusion, a robust failure diagnosis approach that uses multimodal data. It leverages embedding techniques and data augmentation to represent the multimodal data of service instances, combines deployment data and traces to build a dependency graph, and uses a graph neural network to localize the root cause instance and determine the failure type. Our evaluations using real-world datasets show that DiagFusion outperforms existing methods in terms of root cause instance localization (improving by 20.9% to 368%) and failure type determination (improving by 11.0% to 169%).

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2302.10512
Document Type :
Working Paper