What is it about?

Machine learning applications perform well at classification: estimating which class each unit belongs to. Over recent decades, the field of machine learning has put a lot of effort into developing well-performing classification algorithms. However, using classification algorithms in official statistics to count the number of units that belong to a class always introduces a bias, known as misclassification bias. Because misclassification bias does not occur in traditional applications of machine learning, it has received little attention in the academic literature. In this paper, we introduce a new method to correct for misclassification bias that performs better than traditional methods.
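
To make the idea of misclassification bias concrete, here is a minimal sketch for a two-class problem. It only illustrates how the bias arises when classified units are simply counted; it is not the correction method from the paper. The function name and the numbers (a true share of 10%, and a classifier that is right 90% of the time in each class) are assumptions chosen for illustration.

```python
def expected_classify_and_count(true_share, sensitivity, specificity):
    """Expected share of units that an imperfect classifier labels as positive.

    true_share:  actual fraction of units in the class of interest
    sensitivity: probability that a positive unit is labelled positive
    specificity: probability that a negative unit is labelled negative
    """
    true_positives = true_share * sensitivity
    false_positives = (1.0 - true_share) * (1.0 - specificity)
    return true_positives + false_positives


# Illustrative (assumed) values, not taken from the paper.
true_share = 0.10
sensitivity = 0.90
specificity = 0.90

expected_share = expected_classify_and_count(true_share, sensitivity, specificity)
bias = expected_share - true_share

print(f"true share:     {true_share:.2f}")
print(f"expected count: {expected_share:.2f}")
print(f"bias:           {bias:.2f}")
```

With these assumed numbers, a classifier that is 90% accurate in both classes is expected to report a share of 0.18 where the true share is 0.10. The false positives from the large class outnumber the false negatives from the small class, so the errors do not cancel out in the count.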

Why is it important?

National statistical institutes are often interested in counting the number of units that belong to a specific class. Machine learning applications can help them produce statistics in interesting new fields. However, estimating the number of units in a class with machine learning applications leads to misclassification bias, which makes the statistics less accurate. The method presented in this paper improves the quality of official statistics that are produced with machine learning applications.

Perspectives

Quantification with machine learning applications has not received much attention, even though it is very important in many fields. Many more research opportunities lie in this field, and this paper contributes to the existing theory of quantification.

Kevin Kloos
Universiteit Leiden

Read the Original

This page is a summary of: A new generic method to improve machine learning applications in official statistics, Statistical Journal of the IAOS, November 2021, IOS Press, DOI: 10.3233/sji-210885.