What is it about?

Developers use learned models to build Software-2.0 systems. This novel paradigm has gained substantial popularity over the past years. ML libraries are a major contributor to the rapid growth of ML, as they simplify the huge complexity of implementing Software-2.0 systems. However, the rate of growth will slow without an understanding of the practices and challenges of maintaining and evolving ML libraries in Software-2.0 systems. In this paper we use complementary empirical methods (mining 3,340 software repositories containing over 809,534 ML constructs, and conducting surveys with 109 avid users of ML libraries) to answer six broad research questions. Some of our key findings are: (1) Software-2.0 projects that use ML libraries are rapidly increasing. This is not a fad but an established trend, (2) Developers use multiple ML libraries to implement ML development workflows, (3) ML library updates result in cascading library updates, (4) ML library updates pose more challenges (e.g., binary incompatibility of pre-trained ML models, ML model benchmarking, etc.) than source code adaptations, (5) Software-2.0 developers use tools from Software-1.0 (e.g., TravisCI, AppVeyor) that are not suitable for their new needs. We discovered several new tools (e.g., reporting model coverage, benchmarking ML models, etc.) for evolving Software-2.0 systems, and (6) Retrofitting ML libraries for mature projects is challenging. We uncover eight different barriers, including inaccessible data, not enough processing/battery power in edge devices, and inadequate developer training/documentation. We hope that this paper serves as a call to action to address the unique challenges of maintaining and evolving Software-2.0 systems. Our goal is to inspire a symbiotic ecosystem where researchers, tool builders, library vendors, and hardware vendors work together to assist developers towards creating better Software-2.0 systems.

Why is it important?

Despite some folkloric evidence about the widespread use of ML libraries, there is no systematic study about the use of ML libraries in Software-2.0. How do developers maintain and evolve ML libraries in Software-2.0? What challenges do developers face when using ML libraries? What are the tools the developers are currently using to evolve Software-2.0? What are the new opportunities to better assist Software-2.0 developers? We have very little quantitative and qualitative knowledge to answer these questions. This lack of knowledge negatively impacts developers, tool builders, researchers, hardware vendors, and library vendors. To fill in these crucial gaps, we employ complementary established research methodologies.

Perspectives

This paper documents many problems that ML developers face when developing ML software. It is therefore one of the best places to look for new research opportunities at the intersection of Machine Learning and Software Engineering.

Malinda Dilhara
University of Colorado Boulder

Read the Original

This page is a summary of: Understanding Software-2.0, ACM Transactions on Software Engineering and Methodology, July 2021, ACM (Association for Computing Machinery), DOI: 10.1145/3453478.