Abstract
This paper presents the findings of a study on covert hate speech detection that combines corpus linguistics and sentiment analysis. The study, funded by the British Academy and the European Regional Development Fund, investigates below-the-line (BTL) comments about Ukrainians written in Polish, with a focus on the theme of self-victimisation. We drew on two corpora (a representative and a tailored corpus) and applied a two-tier methodology: in the quantitative phase, we extracted a self-victimisation trope via Sketch Engine, and in the qualitative phase, we examined the general sentiment of the BTL comments about Ukrainians. For this purpose, we used two software packages, MultiEmo and Hate Speech (CLARIN-PL). Our findings show that, on average, 30% of the comments feature negative sentiment; however, the self-victimisation trope, investigated through the lexemes we-they, victim, Pole, Ukrainian, is notable and worthy of further investigation. We provide recommendations for fine-tuning the software and increasing manual annotation in follow-up studies.
| Original language | English |
|---|---|
| Journal | Applied Linguistics Review |
| Publication status | Accepted/In press - 27 Nov 2025 |
Keywords
- covert hate speech
- hate speech
- online hate
- sentiment analysis
- corpus linguistics
- corpus methodologies
- victimization
- Ukraine
- Poland