What is it about?

Failures are first-class citizens in modern microservice- and service-based applications. Here we survey existing techniques for detecting failures from their symptoms (that is, anomalies) and for determining their possible root causes. The survey covers 46 studies, which describe how to process services' monitored KPIs, logs, or distributed traces to perform anomaly detection, root cause analysis, or both.

Why is it important?

Our survey benefits both practitioners and researchers working with modern multi-service applications. It helps them find the anomaly detection and root cause analysis techniques best suited to their needs, and it discusses open challenges and possible research directions on the topic.

Read the Original

This page is a summary of: Anomaly Detection and Failure Root Cause Analysis in (Micro) Service-Based Cloud Applications: A Survey, ACM Computing Surveys, April 2023, ACM (Association for Computing Machinery), DOI: 10.1145/3501297.
