What is it about?

This paper explains why correspondence tables are important for classifications, and describes the newly developed "correspondenceTables" R package. It uses practical examples to show the package's strengths and weaknesses in reducing the workload of statistical classification experts, so that they can focus on tasks where their expertise is needed.

Why is it important?

It is already possible to create correspondence tables between two classifications automatically by means of a large "outer join" of intermediate correspondence tables. However, as is so often the case, the main challenge is input data quality: intermediate correspondence tables are typically set up for purposes other than automatic correspondence table creation, and simply feeding them into an "outer join" may produce candidate correspondence tables that are inappropriate (because they are incomplete or contain misleading records). The main added value of the correspondenceTables R package presented in this paper is thus its extensive quality control, including the flagging of problematic records. By applying the package (sometimes repeatedly, until all quality issues are fixed), statistical classification experts obtain a candidate correspondence table in which only the tricky (e.g. many-to-many) cases are highlighted, allowing them to focus on the most challenging records instead of carrying out tasks of a more clerical nature.
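To make the "outer join" idea concrete, here is a minimal sketch in plain Python. It is not the correspondenceTables package and does not reproduce its interface or its quality checks: the function names, the flag labels and the toy codes (A1, B1, C1, ...) are all assumptions made for illustration. It chains a table A-to-B with a table B-to-C on the shared B code, keeps records that fail to match on either side, and flags the candidate records that would need an expert's attention.

```python
from collections import defaultdict


def chain_tables(a_to_b, b_to_c):
    """Full outer join of two correspondence tables on the shared B code.

    Both inputs are lists of (source_code, target_code) pairs.
    Unmatched codes are kept, with None on the missing side.
    """
    b_to_cs = defaultdict(set)
    for b, c in b_to_c:
        b_to_cs[b].add(c)

    candidates = set()
    matched_b = set()
    for a, b in a_to_b:
        if b in b_to_cs:
            matched_b.add(b)
            for c in b_to_cs[b]:
                candidates.add((a, c))
        else:
            candidates.add((a, None))  # A code with no route into C

    for b, cs in b_to_cs.items():
        if b not in matched_b:
            for c in cs:
                candidates.add((None, c))  # C code with no route back to A

    return sorted(candidates, key=lambda p: (p[0] or "", p[1] or ""))


def flag_records(candidates):
    """Label each candidate record: 'unmatched', 'many-to-many' or 'ok'."""
    targets = defaultdict(set)  # A code -> C codes it maps to
    sources = defaultdict(set)  # C code -> A codes mapping to it
    for a, c in candidates:
        if a is not None and c is not None:
            targets[a].add(c)
            sources[c].add(a)

    flags = {}
    for a, c in candidates:
        if a is None or c is None:
            flags[(a, c)] = "unmatched"
        elif len(targets[a]) > 1 and len(sources[c]) > 1:
            flags[(a, c)] = "many-to-many"
        else:
            flags[(a, c)] = "ok"
    return flags


# Hypothetical toy tables; B9 and B7 deliberately fail to match.
a_to_b = [("A1", "B1"), ("A2", "B1"), ("A3", "B2"), ("A4", "B9")]
b_to_c = [("B1", "C1"), ("B1", "C2"), ("B2", "C3"), ("B7", "C4")]

candidates = chain_tables(a_to_b, b_to_c)
flags = flag_records(candidates)
```

In this toy data only the A3 to C3 record comes out as unproblematic; the four records around B1 are flagged as many-to-many, and the two records with a missing side point to gaps in the input tables.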

Perspectives

The practical examples show both situations where the package excels (parallel merges in two classifications) and situations where the package generates a lot of "Cartesian noise" (parallel splits in two classifications). Like many great discoveries (not saying that this is one of them...), this is quite obvious in hindsight. A practical approach for tackling this (isolating the parallel splits) is presented.
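One way to see why parallel splits are noisy while parallel merges are not is the toy sketch below. It is a simplified reading of the situation rather than an example from the paper: a single hypothetical pivot classification links the two classifications, and every code and the helper join_via_pivot are invented for illustration.

```python
def join_via_pivot(left_to_pivot, pivot_to_right):
    """Link two classifications through a shared pivot code (inner join)."""
    return sorted({
        (left, right)
        for left, pivot in left_to_pivot
        for pivot2, right in pivot_to_right
        if pivot == pivot2
    })


# Parallel split: pivot code X is split into two codes on both sides.
# Assume the true links are A1a-B1a and A1b-B1b.
split_left = [("A1a", "X"), ("A1b", "X")]
split_right = [("X", "B1a"), ("X", "B1b")]
split_candidates = join_via_pivot(split_left, split_right)
true_links = {("A1a", "B1a"), ("A1b", "B1b")}
noise = [pair for pair in split_candidates if pair not in true_links]

# Parallel merge: pivot codes X1 and X2 are merged into one code on both sides.
merge_left = [("M", "X1"), ("M", "X2")]
merge_right = [("X1", "N"), ("X2", "N")]
merge_candidates = join_via_pivot(merge_left, merge_right)
```

With the split, the join cannot tell which sub-code belongs with which, so it returns all four combinations and half of them are noise. With the merge, every route leads to the same single pair, so the candidate table is exact.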

Dr Martin Karlberg
Eurostat

Read the Original

This page is a summary of: An R package for automatically generating candidate correspondence tables between classifications, Statistical Journal of the IAOS, December 2023, IOS Press, DOI: 10.3233/sji-230039.