This article recommends that social scientists think about Type I errors in a different way.
What is it about?
In this paper, I consider two types of replication in relation to two types of Type I error probability.

First, I consider the distinction between exact and direct replications. Exact replications duplicate all aspects of a study that could potentially affect the results. In contrast, direct replications duplicate only those aspects of the study that are thought to be theoretically essential to reproduce the original effect.

Second, I consider two types of Type I error probability. The Neyman-Pearson Type I error rate refers to the maximum frequency of incorrectly rejecting a null hypothesis if the test were repeatedly conducted on a series of different random samples, all drawn from exactly the same null population. Hence, the Neyman-Pearson Type I error rate refers to a long run of exact replications. In contrast, the Fisherian Type I error probability is the probability of incorrectly rejecting a null hypothesis in relation to the particular sample under consideration. Hence, the Fisherian Type I error probability refers to a one-off sample rather than to a series of samples drawn during a long run of exact replications.

I argue that social science deals with irreversible units (people, social groups, and social systems), which make exact replications impossible. Consequently, the Neyman-Pearson Type I error rate is not meaningful in social science, because it relies on a concept of exact replication that cannot be implemented in this field. The Fisherian Type I error probability is more appropriate in social science because it refers to one-off samples. Specifically, I argue that the Fisherian Type I error probability can be applied to each sample-specific decision about rejecting the same substantive null hypothesis in a series of direct replications.
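To make the long-run idea behind the Neyman-Pearson Type I error rate concrete, here is a minimal simulation sketch (my own illustration, not from the paper; the function names are hypothetical). It treats each redraw from the same null population as an "exact replication" and shows that the long-run frequency of false rejections approximates the nominal alpha level of .05.

```python
import math
import random

def one_sample_z_test(sample, mu0=0.0, sigma=1.0):
    """Two-sided z-test with known sigma; True means H0 is rejected at alpha = .05."""
    n = len(sample)
    z = (sum(sample) / n - mu0) * math.sqrt(n) / sigma
    return abs(z) > 1.96  # critical value for a two-sided test at alpha = .05

def simulate_long_run(n_replications=10_000, n_per_sample=30, seed=1):
    """Repeatedly redraw samples from the same null population, N(0, 1).

    Each redraw stands in for an 'exact replication'; the returned value is
    the long-run frequency of incorrectly rejecting the true null hypothesis.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_replications):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_per_sample)]
        if one_sample_z_test(sample):
            rejections += 1
    return rejections / n_replications

rate = simulate_long_run()
print(rate)  # long-run rejection frequency, close to the nominal .05
```

The point of the contrast in the paper is that this simulation is only coherent because a computer can redraw samples from an unchanged null population; with irreversible units such as people and social systems, no such redraw exists, which is why the one-off Fisherian probability is argued to be the more appropriate interpretation.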
I conclude that the replication crisis may be partly (not wholly) due to researchers’ unrealistic expectations about replicability based on their consideration of the Neyman-Pearson Type I error rate across a long run of exact replications.
Why is it important?
This article has important implications for scientists who use significance testing. In particular, it affects the meaning and interpretation of “statistically significant results” as well as expectations regarding the replication of these results.
The following have contributed to this page: A/Prof Mark Rubin