What is it about?

The article describes the creation of a digitized corpus of all Canadian parliamentary debates since 1901. The corpus is publicly available at www.lipad.ca.

Why is it important?

Releasing the digitized Hansard marks a significant step toward open access to political data in Canada. The corpus contains hundreds of millions of words, covers more than a century of political history, and includes linked data about politicians and their affiliations. The database makes it possible to conduct research on an unprecedented scale in Canadian politics and in the social sciences more generally. The corpus can also be used to develop applications in computational linguistics, artificial intelligence and machine learning.

Perspectives

I am an Assistant Professor in the Department of Political Science at the University of Toronto. My research applies methods of natural language processing and machine learning to answer questions in the social sciences. I teach computer-assisted textual analysis at the University of Toronto, and I have been involved with the Lipad team since 2014.

Ludovic Rheault
University of Toronto

Read the Original

This page is a summary of: Digitization of the Canadian Parliamentary Debates, Canadian Journal of Political Science, January 2017, Cambridge University Press, DOI: 10.1017/s0008423916001165.
