What is it about?

Cloud-native applications increasingly adopt the microservices architecture, which favors elasticity to satisfy application performance requirements in the face of variable workloads. To simplify elasticity management, the trend is to create an auto-scaler instance per microservice, which controls its horizontal scalability using the classic threshold-based policy. Although easy to implement, manually setting the scaling thresholds, which are usually statically defined on a single metric, may lead to poor scaling decisions when applications are heterogeneous in terms of resource consumption. In this paper, we study dynamic multi-metric threshold-based scaling policies that exploit Reinforcement Learning (RL) to autonomously update the scaling thresholds, one per controlled resource (CPU and memory). The proposed RL approaches (i.e., QL, MB, and DQL Threshold) use different degrees of knowledge about the system dynamics. To model the threshold-adaptation actions, we consider two RL-based architectures. In the single-agent architecture, one agent drives the updates of both scaling thresholds. To speed up learning, the multi-agent architecture adopts a distinct agent per threshold. Simulation- and prototype-based results show the benefits of the proposed solutions compared to state-of-the-art policies, and highlight the advantages of the multi-agent MB Threshold and DQL Threshold approaches in terms of deployment objectives and execution times.
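To make the idea concrete, below is a minimal Python sketch of a tabular Q-learning agent that adapts a single scaling threshold (e.g., the CPU one); in the multi-agent architecture, a second, identical agent would manage the memory threshold. The state encoding, action set, hyper-parameters, and threshold bounds here are illustrative assumptions, not the paper's exact formulation.

```python
import random
from collections import defaultdict

# Illustrative threshold-adaptation agent (model-free Q-learning).
# Actions raise, keep, or lower the scaling threshold (in percentage points);
# the action set and hyper-parameters are assumptions for illustration.
ACTIONS = (-5, 0, +5)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

class ThresholdAgent:
    def __init__(self, threshold=70, low=20, high=90):
        self.threshold = threshold        # current scaling threshold (%)
        self.low, self.high = low, high   # admissible threshold bounds
        self.q = defaultdict(float)       # Q[(state, action)] -> value

    def choose_action(self, state):
        # Epsilon-greedy exploration over the threshold-update actions.
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + GAMMA * best_next
        self.q[(state, action)] += ALPHA * (td_target - self.q[(state, action)])

    def apply(self, action):
        # Clamp the updated threshold within the admissible bounds.
        self.threshold = min(self.high, max(self.low, self.threshold + action))
```

In such a setup, the reward would typically trade off application performance (e.g., response-time violations) against resource cost; the MB (model-based) variant would exploit estimated system dynamics instead of the model-free update above, and the DQL variant would approximate the Q-function with a neural network rather than a table.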


Why is it important?

Novel RL-based techniques for scaling microservices while taking the utilization of multiple resources (i.e., CPU and memory) into account.

Read the Original

This page is a summary of: Dynamic Multi-metric Thresholds for Scaling Applications Using Reinforcement Learning, IEEE Transactions on Cloud Computing, January 2022, Institute of Electrical & Electronics Engineers (IEEE). DOI: 10.1109/tcc.2022.3163357.
