ParlVote: A corpus for sentiment analysis of political debates

Gavin Abercrombie, Riza Theresa Batista-Navarro

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Debate transcripts from the UK Parliament contain information about the positions taken by politicians towards important topics, but are difficult for people to process manually. While sentiment analysis of debate speeches could facilitate understanding of the speakers' stated opinions, datasets currently available for this task are small when compared to the benchmark corpora in other domains. We present ParlVote, a new, larger corpus of parliamentary debate speeches for use in the evaluation of sentiment analysis systems for the political domain. We also perform a number of initial experiments on this dataset, testing a variety of approaches to the classification of sentiment polarity in debate speeches. These include a linear classifier as well as a neural network trained using a transformer word embedding model (BERT), and fine-tuned on the parliamentary speeches. We find that in many scenarios, a linear classifier trained on a bag-of-words text representation achieves the best results. However, with the largest dataset, the transformer-based model combined with a neural classifier provides the best performance. We suggest that further experimentation with classification models and observations of the debate content and structure are required, and that there remains much room for improvement in parliamentary sentiment analysis. © European Language Resources Association (ELRA), licensed under CC-BY-NC
Original language: English
Title of host publication: Proceedings of the 12th Language Resources and Evaluation Conference
Pages: 5073–5078
Publication status: Published - 2020
Event: 12th International Conference on Language Resources and Evaluation 2020 - Marseille, France
Duration: 11 May 2020 → 16 May 2020
https://lrec2020.lrec-conf.org/en/index.html

Conference

Conference: 12th International Conference on Language Resources and Evaluation 2020
Abbreviated title: LREC 2020
Country/Territory: France
City: Marseille
Period: 11/05/20 → 16/05/20
