What is it about?

Modern datacenters rely on many network paths between servers to keep cloud services fast and reliable. A software switch decides which path each network connection should use. This operation, called traffic splitting, sounds simple: if one path is given more weight than another, it should receive a proportionally larger share of the connections.

In practice, however, our paper shows that traffic splitting in software switches is not as accurate or as lightweight as many systems assume. Existing techniques can send too many connections to some paths and too few to others, creating avoidable congestion. Some methods improve accuracy, but only by using more CPU cycles, memory, or per-packet processing time inside the switch.

We introduce VALO, a new traffic splitting technique for software switches. VALO is built on two ideas. First, a score graph models how existing score-based traffic splitting actually distributes connections. Second, VALO gravity adjusts the scoring process so that the final connection distribution better follows the intended path weights. VALO is implemented in Open vSwitch and evaluated with real-world traces and datacenter workloads, including web search, data mining, deep learning, and in-memory cache services.

The main message is that traffic splitting should not be treated as a solved low-level detail. Its accuracy and overhead directly affect datacenter application performance, and improving it can make multipath networking faster and more efficient.
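To make the accuracy problem concrete, here is a minimal Python sketch of one simple score-based splitter. It is an illustration under stated assumptions, not VALO and not code from the paper: we assume each path's score is a 16-bit hash of the connection and path multiplied by the path weight, and that the connection goes to the highest-scoring path. The names `hash16`, `pick_path`, and `observed_shares`, the path labels, and the choice of SHA-256 as the hash are all hypothetical.

```python
import hashlib


def hash16(flow_id: int, path: str) -> int:
    """Deterministic 16-bit pseudo-random value for a (flow, path) pair."""
    digest = hashlib.sha256(f"{flow_id}:{path}".encode()).digest()
    return int.from_bytes(digest[:2], "big")


def pick_path(flow_id: int, weights: dict) -> str:
    """Assumed scoring rule: score = hash * weight, highest score wins."""
    return max(weights, key=lambda path: hash16(flow_id, path) * weights[path])


def observed_shares(weights: dict, n_flows: int) -> dict:
    """Fraction of connections that each path actually receives."""
    counts = dict.fromkeys(weights, 0)
    for flow_id in range(n_flows):
        counts[pick_path(flow_id, weights)] += 1
    return {path: count / n_flows for path, count in counts.items()}


# Two paths with intended weights 1:2, i.e. intended shares of 1/3 and 2/3.
weights = {"path_a": 1, "path_b": 2}
total_weight = sum(weights.values())
intended = {path: w / total_weight for path, w in weights.items()}
shares = observed_shares(weights, n_flows=20000)

for path in weights:
    print(f"{path}: intended {intended[path]:.3f}, observed {shares[path]:.3f}")
```

Under this assumed rule, the heavier path receives roughly three quarters of the connections rather than the intended two thirds. For two independent uniform hash values, the chance that twice one value exceeds the other is 3/4, so the error comes from the scoring rule itself and does not shrink as more connections arrive.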

Why is it important?

Many datacenter networking studies and systems focus on how to choose path weights based on congestion, failures, or link capacity. This paper highlights a missing layer in that design: even if the weights are well chosen, the software switch must still realize those weights accurately and efficiently.

When traffic splitting is inaccurate, some paths become overloaded while others are underused. This can increase flow completion time for real workloads. When traffic splitting is resource-heavy, the switch spends more CPU cycles and time just deciding where packets should go. Both problems become more serious as datacenters run communication-intensive services such as AI training, analytics, and large-scale cloud applications.

VALO addresses both issues at the same time. In the paper’s evaluation, VALO achieves much higher traffic splitting accuracy and much better resource efficiency than existing techniques. It also reduces communication completion time for real datacenter workloads. These results suggest that future work on datacenter load balancing, multipath routing, software switches, and cloud networking should consider not only how path weights are computed, but also how accurately and efficiently those weights are enforced.

Key takeaways:

1. Traffic splitting in software switches is a critical but often overlooked part of datacenter networking.
2. Existing traffic splitting techniques can be inaccurate, resource-inefficient, or both.
3. VALO uses score graph modeling and VALO gravity to match traffic distribution more closely to intended path weights.
4. VALO is implemented in Open vSwitch and evaluated with real traces and datacenter workloads.
5. Researchers building on multipath routing or datacenter load balancing may need to account for traffic splitting accuracy, rather than assuming it is ideal.

Perspectives

For a long time, much of the discussion around datacenter multipath networking has focused on deciding the right path weights: how to react to congestion, how to avoid failures, and how to balance traffic across available links. This work came from questioning a more basic assumption: once the weights are given, does the software switch actually split traffic according to them? Our results show that this assumption does not always hold. The traffic splitting mechanism itself can introduce substantial error and overhead. That means a system may appear well designed at the control-plane level while still losing performance at the switch-level decision point. VALO is our attempt to make this lower-level mechanism both more accurate and more practical. We hope the work encourages researchers and practitioners to revisit traffic splitting as a first-class design problem in datacenter networking, especially when evaluating multipath routing, software switch performance, or cloud systems that depend on predictable network behavior.

Gyeongsik Yang
Korea University

Read the Original

This page is a summary of: Revisiting Traffic Splitting for Software Switch in Datacenter, Proceedings of the ACM on Measurement and Analysis of Computing Systems, May 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3727131.
