What is it about?
This paper explains why correspondence tables are important for statistical classifications and describes the newly developed "correspondenceTables" R package. Practical examples show the package's strengths and weaknesses in reducing the workload of statistical classification experts, so that they can focus on tasks where their expertise is needed.
Why is it important?
It is already possible to create correspondence tables between two classifications automatically by means of a large "outer join" of intermediate correspondence tables. However, as is so often the case, the main challenge is the quality of the input data: intermediate correspondence tables are typically set up for purposes other than automatic correspondence table creation, and simply feeding them into an "outer join" may produce candidate correspondence tables that are inappropriate, either because they are incomplete or because they contain misleading records. The main added value of the correspondenceTables R package presented in this paper is therefore its extensive quality control, including the flagging of problematic records. By applying the package (sometimes repeatedly, until all quality issues are fixed), statistical classification experts obtain a candidate correspondence table in which only the tricky cases (e.g. many-to-many) are highlighted, allowing them to focus on the most challenging records rather than on tasks of a more clerical nature.
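The idea of chaining intermediate tables with an outer join and then flagging the difficult records can be illustrated with a minimal sketch. It is written in plain Python rather than R, and it does not reproduce the correspondenceTables package or its interface: the function names chain_tables and flag_candidates, the toy codes (A1, B1, C1, ...) and the "ok"/"review" labels are all assumptions made up for this illustration.

```python
from collections import defaultdict

def chain_tables(a_to_b, b_to_c):
    """Full outer join of two correspondence tables on their shared B codes."""
    b_codes_left = {b for _, b in a_to_b}
    c_by_b = defaultdict(list)
    for b, c in b_to_c:
        c_by_b[b].append(c)

    rows = []
    for a, b in a_to_b:
        if b in c_by_b:
            rows.extend((a, b, c) for c in c_by_b[b])
        else:
            rows.append((a, b, None))  # B code has no counterpart in C
    for b, c in b_to_c:
        if b not in b_codes_left:
            rows.append((None, b, c))  # C code cannot be reached from A
    return rows

def flag_candidates(rows):
    """Split joined rows into flagged A-C pairs and incomplete records."""
    complete = sorted({(a, c) for a, _, c in rows
                       if a is not None and c is not None})
    a_targets = defaultdict(set)
    c_sources = defaultdict(set)
    for a, c in complete:
        a_targets[a].add(c)
        c_sources[c].add(a)

    flagged = []
    for a, c in complete:
        # Many-to-many: A maps to several C codes AND C comes from several A codes.
        many_to_many = len(a_targets[a]) > 1 and len(c_sources[c]) > 1
        flagged.append((a, c, "review" if many_to_many else "ok"))
    incomplete = [row for row in rows if row[0] is None or row[2] is None]
    return flagged, incomplete

# Hypothetical toy tables: classification A -> B and B -> C.
a_to_b = [("A1", "B1"), ("A2", "B2"), ("A2", "B3"), ("A3", "B3"), ("A4", "B9")]
b_to_c = [("B1", "C1"), ("B2", "C2"), ("B3", "C2"), ("B3", "C3"), ("B4", "C4")]

flagged, incomplete = flag_candidates(chain_tables(a_to_b, b_to_c))
for record in flagged:
    print(record)
print("incomplete:", incomplete)
```

In this toy example the pair (A1, C1) needs no attention, the four pairs involving A2, A3, C2 and C3 are marked for expert review, and the records for B9 and B4 surface as incomplete because one of the intermediate tables lacks a matching code. This mirrors, in a very simplified way, the kind of input-quality problem described above.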
Read the Original
This page is a summary of: An R package for automatically generating candidate correspondence tables between classifications, Statistical Journal of the IAOS, December 2023, IOS Press, DOI: 10.3233/sji-230039.