RCAEval is an open-source framework for multimodal root cause analysis (RCA). It provides 20+ ready-to-use RCA tools including metric-based, trace-based, and multi-source RCA approaches. It is accompanied by comprehensive datasets containing 735 failure cases from real-world software systems.
26
-
RCAEval is a Python framework, open-source on GitHub, and installable via PyPi. The datasets are available in Zenodo which can be programmatically downloaded. RCAEval provides many reproducible RCA tools and multimodal benchmark datasets, enabling researchers to evaluate RCA methods under realistic conditions.
25
+
RCAEval is an open-source framework for multimodal root cause analysis (RCA). It provides 20+ ready-to-use RCA methods including single-modality and multi-modality approaches. It is accompanied by comprehensive datasets containing 735 failure cases from real-world systems.
26
+
RCAEval is a Python framework that is open-source on GitHub and installable via PyPI. The datasets are hosted on Zenodo and can be downloaded programmatically. RCAEval provides tools and datasets that enable researchers to evaluate RCA methods under realistic conditions.
27
27
28
28
# Statement of need
29
29
30
-
Modern large and dynamic systems generate massive amounts of observability data including metrics, logs, and traces. When failures occur, they can propagate across multiple components, making it challenging for operators to identify root causes from the overwhelming volume of observable data. Root cause analysis (RCA) aims to pinpoint the faulty component and the specific indicators responsible for the failure.
30
+
Modern large and dynamic systems generate massive amounts of observability data, including time-series metrics, textual logs, and graph-structured traces. When failures occur, they can propagate across multiple components, making it challenging for operators to identify root causes from the overwhelming volume of observable data. Root cause analysis (RCA) aims to pinpoint the faulty component and the specific indicators responsible for the failure.
31
31
32
32
Despite growing research interest in automated RCA, the field lacks a standardized, reproducible benchmark. Existing studies typically evaluate on limited systems with few fault types, often using private datasets that prevent fair comparison. Available resources provide only single-modality analysis (e.g., metrics only) without support for multimodal data combining logs and traces. Commercial observability platforms offer RCA capabilities but are proprietary and not reproducible for research purposes.
33
33
34
-
RCAEval fills this gap by providing: (1) three large-scale datasets with 735 failure cases across three software systems, covering resource faults (CPU, memory, disk), network faults (delay, packet loss), and code-level faults; (2) multimodal telemetry data including metrics, logs, and traces; (3) 15 reproducible RCA tools implementations including state-of-the-art methods, e.g., BARO [@pham2024baro] and CausalRCA [@Xin2023CausalRCA]; and (4) standardized evaluation metrics for consistent comparison.
35
-
36
-
RCAEval targets researchers developing new RCA algorithms, practitioners evaluating methods for production deployment, and educators teaching algorithms.
34
+
RCAEval provides a unified framework with three parts: 20+ reproducible RCA method implementations, including state-of-the-art methods such as BARO [@pham2024baro] and CausalRCA [@Xin2023CausalRCA]; a standard evaluation pipeline; and comprehensive datasets with 735 failure cases. The benchmark datasets were published in my prior work [@pham2024baro;@pham2024root;@pham2025rcaeval] on microservice systems. This paper focuses on describing RCAEval as a framework for the general RCA task, one that is extensible and applicable to various domains. RCAEval targets researchers developing new RCA algorithms, practitioners evaluating methods for production deployment, and educators teaching algorithms.
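To make the idea of a standard evaluation step concrete, here is a minimal sketch of one common way to score an RCA method: top-k accuracy over a set of failure cases. It does not use or reproduce RCAEval's own API. The function name `top_k_accuracy`, the service names, and the sample rankings are all hypothetical and serve only as illustration.

```python
from typing import Sequence


def top_k_accuracy(
    rankings: Sequence[Sequence[str]],
    ground_truths: Sequence[str],
    k: int,
) -> float:
    """Fraction of failure cases whose true root cause is in the top-k candidates."""
    if len(rankings) != len(ground_truths):
        raise ValueError("rankings and ground_truths must have the same length")
    if not rankings:
        raise ValueError("at least one failure case is required")
    if k < 1:
        raise ValueError("k must be at least 1")
    hits = sum(
        truth in ranking[:k] for ranking, truth in zip(rankings, ground_truths)
    )
    return hits / len(rankings)


# Hypothetical ranked outputs of one RCA method on three failure cases
# (most suspicious component first). These are made-up values, not results.
rankings = [
    ["cart", "checkout", "payment"],
    ["payment", "cart", "checkout"],
    ["checkout", "payment", "cart"],
]
ground_truths = ["cart", "cart", "cart"]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(rankings, ground_truths, k):.2f}")
```

Scoring every method with the same function over the same failure cases is what makes the comparison consistent; the ranking format here (a list of component names per case) is an assumption chosen for simplicity.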
37
35
38
36
# State of the field
39
37
40
38
Several tools and datasets exist for RCA, but none provide comprehensive coverage of multimodal telemetry with reproducible methods. Existing libraries focus on metric-based RCA, supporting methods like Bayesian networks and Granger causality, but lack support for log and trace analysis. Available datasets provide limited fault types and no benchmarking framework for systematic evaluation. Commercial observability platforms offer automated root cause analysis features, but their proprietary nature prevents reproducible research comparisons.
41
39
42
-
RCAEval distinguishes itself by providing the first open-source benchmark framework that combines 20+ reproducible RCA tools with large-scale datasets with multimodal observability data. This enables fair, systematic comparison of RCA methods under realistic failure scenarios.
40
+
# Overview
41
+
42
+
RCAEval distinguishes itself by providing the first open-source benchmark framework that combines 20+ reproducible RCA methods with large-scale multimodal failure datasets. This enables fair, systematic comparison of RCA methods under realistic failure scenarios. Figure 1 gives an overview of the RCAEval framework.
43
43
44
44
[Figure 1: overview of the RCAEval framework]
The datasets are available in the `data` directory after download. Details about the datasets are available in the Zenodo repository (DOI: 10.5281/zenodo.14590730).
86
+
The datasets are available in the `data` directory after downloading. Details about the datasets are available in the Zenodo repository (DOI: 10.5281/zenodo.14590730).
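As a quick check after downloading, the contents of the local `data` directory can be listed with the standard library alone. This is a minimal sketch that assumes only that the files were placed under a directory named `data`; the helper `list_dataset_files` is hypothetical and is not part of RCAEval, and nothing is assumed about the file names or layout inside the directory.

```python
from pathlib import Path


def list_dataset_files(data_dir: str = "data") -> list[str]:
    """Return the sorted relative paths of all files found under data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        # Nothing has been downloaded yet (or the path is wrong).
        return []
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    files = list_dataset_files("data")
    if not files:
        print("No files found under 'data'; download the datasets first.")
    for name in files:
        print(name)
```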
87
87
88
88
# Acknowledgements
89
89
90
90
We would like to express our sincere gratitude to the researchers and developers who created the baselines used in our library. Their work has been instrumental in making this project possible. We deeply appreciate the time, effort, and expertise that have gone into developing and maintaining these resources. This project would not have been feasible without their contributions.
91
91
92
-
This framework is built upon my previously published academic work [@pham2024baro;@pham2024root;@pham2025rcaeval] on RCA for microservice systems. As I am working toward general root cause analysis, this software paper positions the framework for general RCA task without limiting itself to microservice systems. Future improvement of this framework focuses on the inclusion of RCA methods and datasets from different fields.
92
+
This framework is built upon my previously published academic work [@pham2024baro;@pham2024root;@pham2025rcaeval] on RCA for microservice systems. As I am working toward general root cause analysis, this software paper positions the framework for the general RCA task without limiting itself to microservice systems. Future improvements to this framework will focus on the inclusion of RCA methods and datasets from different fields.