What is it about?
Certain forms of linguistic annotation, such as part-of-speech and semantic tagging, can be automated with high accuracy. However, manual annotation is still necessary for complex pragmatic and discursive features that lack a direct mapping to lexical forms. This manual process is time-consuming and error-prone, which limits the scalability of function-to-form approaches in corpus linguistics. To address this, our study explores the possibility of using large language models (LLMs) to automate pragma-discursive corpus annotation. We compare GPT-3.5 (the model behind the free-to-use version of ChatGPT), GPT-4 (the model underpinning the Precise mode of the Bing chatbot), and a human coder in annotating apology components in English based on the local grammar framework. We find that GPT-4 outperformed GPT-3.5, with accuracy approaching that of the human coder. These results suggest that LLMs can be successfully deployed to aid pragma-discursive corpus annotation, making the process more efficient, scalable, and accessible.
Why is it important?
We demonstrate that LLMs can support text annotation tasks with natural language prompts. This is important because it indicates a new methodological approach to text and discourse analysis.
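As a rough illustration of what prompt-based annotation and its evaluation might look like in practice, the sketch below builds a natural language annotation prompt and scores a model's labels against a human coder's labels. It is not the procedure used in the study: the tag names, the prompt wording, the function names, and the toy labels are all assumptions made for this example, and no model is actually called.

```python
from typing import Sequence

# Hypothetical tag set for illustration only; the study's actual labels may differ.
TAGS = ["APOLOGISER", "APOLOGISING", "INTENSIFIER", "REASON"]


def build_prompt(utterance: str, tags: Sequence[str] = TAGS) -> str:
    """Compose a natural language annotation instruction to send to an LLM."""
    tag_list = ", ".join(tags)
    return (
        "Annotate each functional component of the apology below "
        f"using only these tags: {tag_list}.\n"
        f"Utterance: {utterance}"
    )


def label_accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Share of positions where the model's label matches the human label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(gold)


# Toy labels, invented for this example (not data from the study).
human_labels = ["APOLOGISER", "APOLOGISING", "INTENSIFIER", "APOLOGISING"]
model_labels = ["APOLOGISER", "APOLOGISING", "APOLOGISING", "APOLOGISING"]

prompt = build_prompt("I am so sorry")
score = label_accuracy(model_labels, human_labels)
```

In a real workflow, the prompt would be sent to a model, the response parsed into labels, and the same scoring applied to each model so that their agreement with the human coder can be compared.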
Perspectives
Our exploration of LLMs for text annotation began in early 2023. While our initial attempts with ChatGPT (GPT-3.5) proved unsuccessful, we were excited to find that GPT-4, released through Bing's chatbot, demonstrated the viability of this approach. This article presents a case study of LLM-assisted discourse annotation. We hope it will serve as a useful example for discourse analysts who want to integrate LLMs into their analytical methodologies.
Danni Yu
Beijing Foreign Studies University
Read the Original
This page is a summary of: Assessing the potential of LLM-assisted annotation for corpus-based pragmatics and discourse analysis, International Journal of Corpus Linguistics, June 2024, John Benjamins.
DOI: 10.1075/ijcl.23087.yu