Grounding LLMs to In-prompt Instructions: Reducing Hallucinations Caused by Static Pre-training Knowledge

Angus Addlesee*

*Corresponding author for this work

Research output: Conference contribution (Chapter in Book/Report/Conference proceeding)

Abstract

When deploying LLMs in certain commercial or research settings, domain-specific knowledge must be explicitly provided within the prompt. This in-prompt knowledge can conflict with an LLM’s static world knowledge learned at pre-training, causing the model to hallucinate (see examples in Table 1). In safety-critical settings, like healthcare and finance, these hallucinations can harm vulnerable users. We have curated a QA corpus containing information that LLMs could not have seen at pre-training. Using our corpus, we have probed various LLMs, manipulating both the prompt and the knowledge representation. We have found that our ‘Jodie’ prompt consistently improves the model’s textual grounding to the given knowledge, and in turn the overall answer accuracy. This holds in both the healthcare and finance domains, improving accuracy by up to 28% (mean: 12%). We have also identified that hierarchical and direct node-property graph structures could lead to more interpretable and controllable systems that provide a natural language interface with real-time in-domain knowledge. Our corpus will enable further work on this critical challenge.
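The abstract does not reproduce the ‘Jodie’ prompt wording or the exact graph serialization used in the paper, so the following is only a minimal illustrative sketch of the general setup it describes: in-domain knowledge held as a direct node-property graph, flattened into the prompt, with an instruction steering the model to answer from that knowledge rather than its pre-training knowledge. All identifiers and the instruction text below are assumptions, not the paper's method.

```python
# Minimal sketch (NOT the paper's exact method) of grounding an LLM answer
# in knowledge supplied within the prompt rather than static pre-training
# knowledge. The graph contents, serialization format, and instruction
# wording are illustrative assumptions.

# In-domain knowledge as a direct node-property graph: each node maps to a
# dictionary of its properties, mirroring one representation style probed.
knowledge_graph = {
    "clinic_opening_hours": {"monday": "09:00-17:00", "sunday": "closed"},
    "duty_pharmacist": {"name": "A. Smith", "extension": "4121"},
}

def serialise_graph(graph: dict) -> str:
    """Flatten the node-property graph into prompt-friendly triples."""
    lines = []
    for node, properties in graph.items():
        for prop, value in properties.items():
            lines.append(f"{node} | {prop} | {value}")
    return "\n".join(lines)

def build_grounded_prompt(question: str, graph: dict) -> str:
    """Prepend the serialized knowledge and instruct the model to rely on
    it alone, so answers do not fall back on pre-training knowledge."""
    return (
        "Answer using ONLY the knowledge below. If the answer is not "
        "present, say you do not know.\n\n"
        f"Knowledge:\n{serialise_graph(graph)}\n\n"
        f"Question: {question}\nAnswer:"
    )

if __name__ == "__main__":
    # The resulting string would be sent to an LLM of choice.
    print(build_grounded_prompt("Who is the duty pharmacist?", knowledge_graph))
```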
Original language: English
Title of host publication: Proceedings of Safety4ConvAI: The Third Workshop on Safety for Conversational AI at LREC-COLING 2024
Publisher: European Language Resources Association
Pages: 1-7
Number of pages: 7
ISBN (Print): 9782493814449
Publication status: Published - 21 May 2024
Event: Joint International Conference on Computational Linguistics, Language Resources and Evaluation 2024 - Lingotto Conference Centre, Torino, Italy
Duration: 20 May 2024 - 25 May 2024
https://lrec-coling-2024.org/

Conference

Conference: Joint International Conference on Computational Linguistics, Language Resources and Evaluation 2024
Abbreviated title: LREC-COLING 2024
Country/Territory: Italy
City: Torino
Period: 20/05/24 - 25/05/24
Internet address: https://lrec-coling-2024.org/

Keywords

  • conversational AI
  • corpus
  • knowledge grounding
  • LLM evaluation
  • question answering

ASJC Scopus subject areas

  • Language and Linguistics
  • Education
  • Library and Information Sciences
  • Linguistics and Language
