Assessing the Early Bird Heuristic (for Predicting Project Quality)

Shrikanth N. C.; Tim Menzies

doi:10.1145/3583565

What is it about?

In this work, we show that a machine-learning model built using just the first 150 data points of a software engineering project and two features predicts software defects as well as (and in some cases better than) models built using thousands of data points and many features (data-hungry models). The data-lite method devised in this work augments transfer learning methods, speeds up the transfer learning process, and identifies defective changes in software projects as well as or better than machine learning methods that use complex ensemble and tuning (hyper-parameter optimization) approaches.
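The comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data is synthetic, the two feature names (lines_added_z, file_size_z) are hypothetical, and the plain logistic regression learner is an assumption chosen only to keep the example dependency-free. It trains one model on the first 150 changes using two features, trains another on 1,000 changes using ten features, and scores both on the same later changes.

```python
import math
import random


def train_logistic(rows, labels, epochs=100, lr=0.5):
    """Fit a plain logistic regression by batch gradient descent."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted prob minus label
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(model, x):
    """Return 1 (defective) when the linear score is positive, else 0."""
    weights, bias = model
    return 1 if bias + sum(w * xi for w, xi in zip(weights, x)) > 0 else 0


def accuracy(model, rows, labels):
    hits = sum(predict(model, x) == y for x, y in zip(rows, labels))
    return hits / len(rows)


def make_synthetic_history(n_changes, seed=0):
    """Synthetic, time-ordered changes (an assumption, not real project data).

    Columns 0 and 1 are hypothetical standardized features that drive the
    defect label; the remaining eight columns are pure noise.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_changes):
        lines_added_z = rng.gauss(0, 1)  # hypothetical feature name
        file_size_z = rng.gauss(0, 1)    # hypothetical feature name
        noise = [rng.gauss(0, 1) for _ in range(8)]
        p_defect = 1.0 / (1.0 + math.exp(-1.5 * (lines_added_z + file_size_z)))
        rows.append([lines_added_z, file_size_z] + noise)
        labels.append(1 if rng.random() < p_defect else 0)
    return rows, labels


def compare_early_vs_full(early=150, full=1000, total=2000):
    """Train a data-lite and a data-hungry model; score both on later changes."""
    rows, labels = make_synthetic_history(total)
    test_rows, test_labels = rows[full:], labels[full:]

    # Data-lite: only the first `early` changes and only two features.
    lite = train_logistic([r[:2] for r in rows[:early]], labels[:early])
    lite_acc = accuracy(lite, [r[:2] for r in test_rows], test_labels)

    # Data-hungry: the first `full` changes and all ten features.
    hungry = train_logistic(rows[:full], labels[:full])
    hungry_acc = accuracy(hungry, test_rows, test_labels)
    return lite_acc, hungry_acc


if __name__ == "__main__":
    lite_acc, hungry_acc = compare_early_vs_full()
    print(f"first 150 changes, 2 features:   {lite_acc:.3f}")
    print(f"first 1000 changes, 10 features: {hungry_acc:.3f}")
```

Because the synthetic labels depend on only two columns by construction, this sketch shows the shape of the experiment rather than evidence for the claim; the published results come from real project histories.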


Why is it important?

Fixing software defects is not cheap, so it is very useful to prevent them. The machine learning techniques commonly used to predict software defects are data-hungry. Data-hungry methods are computationally heavy (they need more memory and processing power) and are not explainable. As we approach the end of Moore's Law, it is essential to build models that not only consume less memory and processing power but are also greener (friendlier to the environment) and fairer (explainable).

Perspectives

A prevalent misconception is that more data is inherently better for making accurate predictions. Very soon, methods for building models very cheaply will gain much traction in the Artificial Intelligence space. In this paper, we offer shortcuts to simplify software analytics. Data-lite machine learning models can achieve performance similar to that of data-hungry machine learning models.
Shrikanth N.C.
North Carolina State University

This page is a summary of: Assessing the Early Bird Heuristic (for Predicting Project Quality), ACM Transactions on Software Engineering and Methodology, July 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3583565.

Resources

Video
Early Life Cycle Software Defect Prediction. Why? How? ICSE 2021

One way to build faster, more accurate, and explainable Data-lite Machine Learning models
