What is it about?

We consider a geographically distributed ML scenario in which local streaming data are processed on edge computing infrastructure and local updates are synchronized through a central parameter server. In this setting, the major challenges to ML accuracy are: 1) low diversity of the data feeding each model (i.e., insularity), 2) intermittent network connectivity, and 3) frequent changes in the input data (i.e., non-stationarity).

Why is it important?

We propose SCEDA, an efficient reinforcement-learning-based algorithm that optimizes the schedule and content of the model updates sent from the parameter server to the edge servers in order to avoid stale ML models. It makes online scheduling decisions by learning the network connectivity trends of individual edge servers as well as the significance of their updates. To the best of our knowledge, SCEDA is the first staleness control mechanism in which the synchronization period is not defined by static thresholds but is learned from data and adapted to environmental changes over time. The impact of this work goes well beyond our initial use cases of electric vehicles and virtual reality: it is potentially applicable to many stateful learning tasks on distributed, streaming big data.
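
To make the idea of a learned, rather than fixed, synchronization period more concrete, here is a minimal sketch. It is not the SCEDA algorithm from the paper: it swaps in a simple epsilon-greedy bandit, and the class name, the candidate periods, and the toy reward (a staleness cost that grows with the period plus a communication cost that shrinks with it) are all assumptions made purely for illustration.

```python
import random


class SyncPeriodLearner:
    """Epsilon-greedy bandit over candidate synchronization periods.

    Hypothetical illustration only; not the method described in the paper.
    """

    def __init__(self, periods, epsilon=0.1, seed=0):
        self.periods = list(periods)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {p: 0 for p in self.periods}    # times each period was tried
        self.values = {p: 0.0 for p in self.periods}  # running mean reward per period

    def best(self):
        # Period with the highest estimated reward so far.
        return max(self.periods, key=lambda p: self.values[p])

    def choose(self):
        # Try every period once, then explore with probability epsilon.
        untried = [p for p in self.periods if self.counts[p] == 0]
        if untried:
            return untried[0]
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.periods)
        return self.best()

    def update(self, period, reward):
        # Incremental update of the mean reward observed for this period.
        self.counts[period] += 1
        self.values[period] += (reward - self.values[period]) / self.counts[period]


def simulated_reward(period, rng, staleness_cost=1.0, sync_cost=16.0):
    # Assumed toy environment: long periods make models stale,
    # short periods cost more network traffic. Noise mimics a changing network.
    noise = rng.gauss(0.0, 0.5)
    return -(staleness_cost * period + sync_cost / period) + noise


env_rng = random.Random(1)
learner = SyncPeriodLearner([1, 2, 4, 8, 16])
for _ in range(2000):
    p = learner.choose()
    learner.update(p, simulated_reward(p, env_rng))

print("learned synchronization period:", learner.best())
```

In this toy setting the expected cost, period + 16 / period, is lowest at a period of 4, so the learner should settle there without any hand-tuned threshold. A real system would also have to handle per-server connectivity and the significance of each update, which this sketch leaves out.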

Perspectives

Writing this article was a great pleasure, as I have initiated strong collaborations with its co-authors on an emerging topic.

Atakan Aral
Technische Universität Wien

Read the Original

This page is a summary of: Staleness Control for Edge Data Analytics, Proceedings of the ACM on Measurement and Analysis of Computing Systems, June 2020, ACM (Association for Computing Machinery), DOI: 10.1145/3392156.
