What is it about?

We compare the "styles" of different sorts of speech with respect to their syntax. We do this by comparing the sequences of parts of speech (POS) that speakers use. The two varieties compared are the conversational English spoken in Australia by first-generation immigrants (after thirty years in the country) and that of their children, who immigrated at age 17 or younger (at an average age below 10).
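As a rough illustration of what comparing POS sequences can look like, the sketch below counts POS n-grams in two groups of already tagged utterances and ranks the n-grams by how much their relative frequencies differ. This is only a minimal sketch under stated assumptions: the summary does not say how the sequences were compared, so the n-gram length (3), the tag names, the function names, and the toy data are all hypothetical and are not taken from the study. A real analysis would also need a POS tagger and a statistical test of whether the differences are reliable.

```python
from collections import Counter


def pos_ngrams(tags, n=3):
    """Count the POS n-grams in one utterance given as a list of tags."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


def relative_freqs(utterances, n=3):
    """Relative frequency of each POS n-gram across a group of utterances."""
    counts = Counter()
    for tags in utterances:
        counts.update(pos_ngrams(tags, n))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def largest_differences(group_a, group_b, n=3, top=5):
    """Rank n-grams by absolute difference in relative frequency (A minus B)."""
    fa = relative_freqs(group_a, n)
    fb = relative_freqs(group_b, n)
    diffs = {g: fa.get(g, 0.0) - fb.get(g, 0.0) for g in set(fa) | set(fb)}
    return sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top]


# Invented toy data (hypothetical tags, not from the study's corpus).
adults = [["PRON", "VERB", "DET", "NOUN"], ["PRON", "VERB", "NOUN"]]
children = [["PRON", "VERB", "DET", "NOUN"],
            ["PRON", "AUX", "VERB", "DET", "NOUN"]]

for gram, diff in largest_differences(adults, children):
    print(" ".join(gram), round(diff, 3))
```

A positive difference marks a sequence that is relatively more frequent in the first group, a negative one a sequence more frequent in the second.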

Why is it important?

This was one of the first papers to show how natural language processing (NLP), here in the form of a part-of-speech (POS) tagger, could play a role in detecting syntactic differences. Applying NLP to conversation was risky, since POS taggers were developed on edited, carefully written texts. Indeed, tagging performance fell massively, but not so much that the project was endangered.

Perspectives

Although the focus was on contact linguistics, specifically the speech of Finnish immigrants to Australia, the study showed that NLP could be harnessed to study style differences of the kind examined in stylometry (within digital humanities).

Professor John Nerbonne
Rijksuniversiteit Groningen

Read the Original

This page is a summary of: Automatically Extracting Typical Syntactic Differences from Corpora, Literary and Linguistic Computing, October 2010, Oxford University Press (OUP),
DOI: 10.1093/llc/fqq017.
