What is it about?

The article assesses different measures of word commonness and explores how each measure performs on a selection of occurrence distributions of Norwegian compound words. This procedure elucidates the strengths and weaknesses of the measures in question and shows, for example, that the predictive accuracy of frequency measures is low when the word distribution is uneven.
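To make the point about uneven distributions concrete, here is a minimal sketch in Python. It is not the article's method: the toy corpus, the two helper functions `raw_frequency` and `text_range`, and the choice of "number of texts containing the word" as a simple dispersion measure are all assumptions made for illustration.

```python
from collections import Counter


def raw_frequency(texts, word):
    """Total number of occurrences of `word` across all texts."""
    return sum(Counter(tokens)[word] for tokens in texts)


def text_range(texts, word):
    """Number of texts in which `word` occurs at least once."""
    return sum(1 for tokens in texts if word in tokens)


# Hypothetical toy corpus of four tokenised texts (invented for illustration).
corpus = [
    ["fjelltur", "fjelltur", "fjelltur", "fjelltur", "fjelltur", "fjelltur"],
    ["husdyr", "sol"],
    ["husdyr", "regn"],
    ["husdyr", "vind"],
]

for word in ("fjelltur", "husdyr"):
    # Print the word, its raw frequency, and the number of texts it appears in.
    print(word, raw_frequency(corpus, word), text_range(corpus, word))
```

In this invented corpus, "fjelltur" has the higher raw frequency, but all of its occurrences sit in a single text, whereas "husdyr" is rarer overall yet appears in three of the four texts. A ranking based on frequency alone would therefore put the unevenly distributed word first.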

Why is it important?

The article is written with a lexicographical context in mind. Many lexicographical projects would benefit from being able to tell which words are more common than others, as this helps them identify a reasonable selection of words to include and exclude. The assessment of word commonness does, however, have many other applications. For example, whenever words feature as stimuli in a psycholinguistic experiment, the underlying commonness of a given word will influence, for instance, the reaction time of respondents encountering that word. Word commonness also plays a role in assessing the readability of a text.


The article is part of a PhD project on how to select compounds for large monolingual dictionaries. Important questions in this respect are: 1) how can one separate common from uncommon compounds, 2) what constitutes an opaque compound, and 3) which compounds are looked up by dictionary users?

Mikkel Ekeland Paulsen
Universitetet i Bergen

Read the Original

This page is a summary of: Assessing word commonness, International Journal of Corpus Linguistics, November 2022, John Benjamins, DOI: 10.1075/ijcl.21037.eke.
The following have contributed to this page