What is it about?

This research studies how to make deep learning models run faster on resource-constrained edge devices, such as IoT devices and mobile systems. Instead of running the entire neural network on the device or sending everything to the cloud, Split Learning divides the model into two parts: one runs on the device and the other runs on the server. However, deciding where to split the model is challenging. A poor split can increase latency due to excessive communication or computation. In this paper, we propose a Genetic Algorithm (GA)-based optimization method that automatically selects the best split point to minimize end-to-end inference latency. The algorithm evaluates possible split positions based on measured computation time and communication delay. Unlike heuristic or brute-force methods, our approach performs a global search to find better solutions efficiently. Experiments using large-scale models such as EfficientNet-B7 show that our method significantly improves the effectiveness of split computing while keeping inference time comparable to existing approaches.
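The split-point search described above can be sketched in a few lines. Everything below is illustrative, not the paper's actual implementation: the per-layer compute times, activation sizes, and link bandwidth are made-up values, and the GA operators (midpoint crossover, uniform random mutation, elitism over a small population) are one simple choice among many.

```python
import random

random.seed(0)

# Hypothetical per-layer profile (NOT measured from the paper): on-device and
# on-server compute times in ms for each of 8 layers, plus the size in KB of
# the activation each layer emits (what crosses the network at the split).
device_ms = [5, 8, 12, 20, 30, 45, 60, 80]
server_ms = [1, 2, 3, 4, 6, 8, 10, 12]
act_kb = [512, 256, 256, 128, 64, 64, 32, 16]
INPUT_KB = 1024            # assumed size of the raw input sent if split == 0
BANDWIDTH_KB_PER_MS = 50   # assumed link speed

def latency(split):
    """End-to-end latency when layers [0, split) run on-device, the rest on the server.

    split == len(device_ms) means fully on-device; the last activation then
    stands in for the (small) result sent back, which keeps the sketch simple.
    """
    dev = sum(device_ms[:split])
    srv = sum(server_ms[split:])
    payload = INPUT_KB if split == 0 else act_kb[split - 1]
    return dev + payload / BANDWIDTH_KB_PER_MS + srv

def evolve(pop_size=20, generations=30, mut_rate=0.3):
    """Genetic search over split points: fitness is the latency() estimate."""
    n = len(device_ms)
    # Seed the population with every candidate split so this tiny search space
    # is fully covered; the GA machinery pays off on much larger spaces.
    pop = list(range(n + 1))
    while len(pop) < pop_size:
        pop.append(random.randrange(n + 1))
    for _ in range(generations):
        pop.sort(key=latency)
        elite = pop[:2]  # elitism: the two best splits survive unchanged
        children = []
        while len(children) < pop_size - len(elite):
            a, b = random.sample(pop[:10], 2)    # parents drawn from the top half
            child = (a + b) // 2                 # crossover: midpoint of parents
            if random.random() < mut_rate:
                child = random.randrange(n + 1)  # mutation: jump to a random split
            children.append(child)
        pop = elite + children
    return min(pop, key=latency)

best = evolve()
print(best, latency(best))
```

In practice the fitness function would use profiled layer timings and the currently measured bandwidth, so the same loop can re-run whenever network conditions change.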

Why is it important?

Edge AI is rapidly growing in applications such as IoT, smart devices, healthcare, and autonomous systems. However, many edge devices lack sufficient computational power to run large deep neural networks entirely on-device. Split Learning provides a promising solution, but its performance heavily depends on selecting the right split point. Existing methods rely on heuristics or exhaustive searches, which are inefficient or inflexible under changing bandwidth and device conditions. Our work is important because it introduces the first Genetic Algorithm-based global optimization framework for split point selection in Split Learning. Compared to heuristic-based Dynamic Split Computing methods, our approach achieves a significantly higher split computing utility rate (74.38% vs. 28.70%) while maintaining similar inference latency. This makes Split Learning more practical, adaptive, and scalable for real-world edge AI deployments, especially in dynamic network environments.

Perspectives

From my perspective, one of the biggest challenges in deploying AI to real-world edge environments is not model accuracy, but system-level optimization. The question is no longer “How accurate is the model?” but rather “How can we run it efficiently under real constraints?” This work reflects my interest in bridging AI algorithms and system optimization. Instead of designing a new neural network architecture, we focus on optimizing how existing large-scale models can be deployed in practical client–server environments. I believe this direction is important for future 6G-enabled edge intelligence systems, where dynamic adaptation to bandwidth and device heterogeneity will be essential. In the future, we plan to extend this work toward multi-objective optimization, considering not only latency but also energy consumption and privacy constraints.

Trung Le Hoang
Università degli Studi di Siena

Read the Original

This page is a summary of: Latency-Aware Split Learning Optimization via Genetic Algorithms, June 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3733566.3734433.
