What is it about?

Modern software systems constantly measure what is happening inside them: how much CPU they use, how many requests they handle, how full their caches are, and more. These measurements are stored as time series (long sequences of numbers over time) and are essential for keeping systems fast, reliable, and cost‑efficient. However, this monitoring data grows very quickly, so observability platforms such as Dynatrace must compress it as it is ingested to keep storage costs and query times under control.

Many existing compression methods for numeric data were developed and tested on domains such as finance or weather, where values change frequently. In our work, we show that real application monitoring data behaves very differently: large portions of it are highly repetitive, with many values staying constant for long periods and entire blocks of records being exactly the same. When algorithms designed for constantly fluctuating data are applied to this more repetitive monitoring data, they do not work as well as they could.

In this paper, we analyze a large real‑world monitoring dataset from Dynatrace and quantify how it differs from standard benchmarks. Based on these findings, we improve an existing floating‑point compression algorithm (Gorilla) and introduce a multi‑layer strategy that first compresses long runs of identical records, then removes redundant information inside each record, and only then applies a specialized floating‑point compressor. This approach reduces storage by about a quarter and significantly speeds up compression and decompression compared to using a single algorithm alone. We further develop a lightweight heuristic that examines each block of data and automatically selects the most suitable compressor for it, achieving compression very close to the best possible choice while adding almost no extra overhead. Overall, our results show that understanding the specific patterns of application monitoring data and adapting compression to them leads to more efficient and practical observability systems.
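
To make the layering idea more concrete, here is a minimal Python sketch of two of the steps described above: collapsing runs of identical records, and the XOR step that Gorilla-style floating‑point compressors build on. It is an illustration under stated assumptions, not the implementation from the paper: the record layout (a tuple of CPU and cache values), the sample numbers, and all function names are hypothetical, and the middle layer (removing redundancy inside each record) as well as the actual bit-level encoding are left out.

```python
import struct
from itertools import groupby


def run_length_records(records):
    """Layer 1 (sketch): collapse consecutive identical records into
    (record, count) pairs."""
    return [(rec, sum(1 for _ in grp)) for rec, grp in groupby(records)]


def float_bits(x):
    """Reinterpret a Python float as its 64-bit IEEE 754 bit pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_with_previous(values):
    """Gorilla-style step (sketch): XOR each value's bits with the bits of
    the previous value. A repeated value yields 0, which a real encoder
    can then store very compactly."""
    prev = 0
    out = []
    for v in values:
        bits = float_bits(v)
        out.append(bits ^ prev)
        prev = bits
    return out


# Hypothetical records: (cpu_usage, cache_fill) sampled at fixed intervals.
records = [(0.5, 128.0)] * 4 + [(0.75, 128.0)] + [(0.5, 128.0)] * 3

# Eight records collapse into three runs.
runs = run_length_records(records)

# The cache column is constant across the remaining runs, so every XOR
# after the first value is zero.
cache_column = [rec[1] for rec, _ in runs]
cache_xors = xor_with_previous(cache_column)
```

In this toy input, the run-length step handles the blocks of identical records before the floating‑point step ever sees them, and the constant column leaves only zeros for the XOR step. That is the kind of repetitiveness the summary describes for real monitoring data.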

Why is it important?

Our work is unique because it is based on a large real‑world dataset from a production observability platform, rather than on synthetic or academic benchmarks. We show that popular floating‑point compression methods, designed and evaluated on domains like finance or weather, do not match the highly repetitive nature of application monitoring data and therefore leave efficiency on the table. It is timely because monitoring and telemetry volumes in cloud‑native systems are growing fast, and observability cost and performance are becoming critical concerns. Even modest gains in compression can significantly reduce storage and processing overhead. The difference our work can make is to shift compression for monitoring data from generic, one‑size‑fits‑all methods to domain‑aware and adaptive strategies. We provide a practical multi‑layer compression design and a lightweight selection heuristic that can be integrated into real systems, helping practitioners build more scalable and cost‑effective observability platforms.

Perspectives

This publication is personally meaningful to me because it grew directly out of my master’s thesis, which focused on real issues in a running observability platform. Implementing a practical solution turned out to be very different from doing a “plain” research project: we had to test thoroughly on real production data and tailor the design to an actual use case, not just to benchmarks. This experience strengthened my belief that systems research should be grounded in real workloads and constraints. For me, this paper captures that lesson: robust, domain‑aware solutions matter more than neat ideas that only work in theory.

Konstantin Urbanides
Johannes Kepler Universität Linz

Read the Original

This page is a summary of: Analyzing and Improving Floating-Point Compression for Application Monitoring Time Series, May 2026, ACM (Association for Computing Machinery),
DOI: 10.1145/3788853.3803097.