SHADES: Towards a Multilingual Assessment of Stereotypes in Large Language Models

Margaret Mitchell, Hamdan Al-Ali, Giuseppe Attanasio, Ioana Baldini, Miruna Clinciu, Jordan Clive, Pieter Delobelle, Manan Dey, Kaustubh Dhole, Timm Dill, Amirbek Djanibekov, Tair Djanibekov, Jad Doughman, Ritam Dutt, Jessica Zosa Forde, Jay Gala, Avijit Ghosh, Sil Hamilton, Carolin Holtermann, Jerry Huang, Lucie Aimée Kaffee, Janavi Kasera, Tanmay Laud, Anne Lauscher, Roberto Luis López, Jonibek Mansurov, Maraim Masoud, Sagnik Mukherjee, Nurdaulet Mukhituly, Nikita Nangia, Shangrui Nie, Anaelia Ovalle, Giada Pistilli, Esther Ploeger, Jeremy Qin, Dragomir Radev, Vipul Raheja, Beatrice Savoldi, Shanya Sharma, Xudong Shen, Karolina Stańczak, Arjun Subramonian, Kaiser Sun, Eliza Szczechla, Tiago Timponi Torrent, Deepak Tunuguntla, Emilio Villa-Cueva, Marcelo Viridiano, Oskar van der Wal, Adina Yakefu, Kayo Yin, Mike Zhang, Sydney Zink, Aurélie Névéol, Zeerak Talat

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Large Language Models (LLMs), the bedrock of many “artificial intelligence” (AI) applications, are known to reproduce social biases present in their training data. Yet resources to measure and control this issue are limited. Research identifying and mitigating stereotype biases has primarily concentrated on English, lagging behind the rapid advancement of LLMs in multilingual settings. To help further advance the ability to address stereotype bias in AI systems, we introduce a new multilingual dataset: SHADES. Designed for examining culturally specific stereotypes that may be learned by LLMs, SHADES includes over 300 stereotypes from 37 regions, translated across 16 languages and annotated with multiple features to aid multilingual stereotype analysis. All statements in all languages are paired with templates, to serve as a resource for unlimited generation of new evaluation data. We demonstrate the utility of the dataset in a series of exploratory evaluations that reveal significant differences in how stereotypes are recognized and reflected across models and languages.

Original language: English
Title of host publication: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Publisher: Association for Computational Linguistics
Pages: 11995-12041
Number of pages: 47
Volume: 1
ISBN (Electronic): 9798891761896
Publication status: Published - Apr 2025
Event: 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies - Hybrid, Albuquerque, United States
Duration: 29 Apr 2025 to 4 May 2025

Conference

Conference: 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics
Abbreviated title: NAACL-HLT 2025
Country/Territory: United States
City: Hybrid, Albuquerque
Period: 29/04/25 to 4/05/25

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
  • Software
