A new generic method to improve machine learning applications in official statistics

Kevin Kloos

doi:10.3233/sji-210885

What is it about?

Machine learning applications perform well at estimating which unit belongs to which class, a task better known as classification. Over the last decades, the field of machine learning has put a lot of effort into developing well-performing classification algorithms. However, when classification algorithms are used in official statistics to count the number of units that belong to a class, the resulting count is biased. This is known as misclassification bias. It does not arise in traditional applications of machine learning, where the goal is to classify individual units rather than to count them, and it has therefore received little attention in the academic literature. In this paper, we introduce a new method to correct for misclassification bias that performs better than traditional methods.
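To make the idea of misclassification bias concrete, here is a minimal sketch of what happens when units labelled by an imperfect classifier are simply counted. It is an illustration only, not the correction method from the paper. The function names and the values for the true share, sensitivity and specificity are hypothetical and chosen purely for the example.

```python
def expected_classify_and_count(true_share, sensitivity, specificity):
    """Expected share of units labelled positive by an imperfect classifier.

    true_share:  actual fraction of units in the class of interest
    sensitivity: probability that a unit in the class is labelled correctly
    specificity: probability that a unit outside the class is labelled correctly
    """
    true_positives = true_share * sensitivity
    false_positives = (1.0 - true_share) * (1.0 - specificity)
    return true_positives + false_positives


def misclassification_bias(true_share, sensitivity, specificity):
    """Difference between the expected counted share and the true share."""
    expected = expected_classify_and_count(true_share, sensitivity, specificity)
    return expected - true_share


# Hypothetical numbers: 10% of units belong to the class, and the
# classifier is right 90% of the time for both classes.
true_share = 0.10
expected = expected_classify_and_count(true_share, 0.90, 0.90)
bias = misclassification_bias(true_share, 0.90, 0.90)
print(f"true share: {true_share:.2f}, expected count share: {expected:.2f}, bias: {bias:.2f}")
```

With these assumed numbers, a classifier that is 90% accurate on both classes would be expected to report about 18% of units in the class, although the true share is only 10%. Because the class is small, the false positives from the large group outweigh the missed units from the small one.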


Why is it important?

National statistical institutes are often interested in counting the number of units that belong to a specific class. Machine learning applications can help them produce statistics on new topics. However, estimating the number of units in a class with machine learning applications leads to misclassification bias, which makes the resulting statistics less accurate. The method presented in this paper improves the quality of official statistics that are produced with machine learning applications.

Perspectives

Quantification with machine learning applications has not received much attention, even though it is very important in many fields. Many more research opportunities lie in this field, and this paper contributes to the existing theory of quantification.
Kevin Kloos
Universiteit Leiden

This page is a summary of: A new generic method to improve machine learning applications in official statistics, Statistical Journal of the IAOS, November 2021, IOS Press, DOI: 10.3233/sji-210885.

Contributors

The following have contributed to this page

Kevin Kloos
Universiteit Leiden
