Abstract
Assessing target-language quality is among the most important aspects of interpreter training. It is time- and effort-intensive, yet it remains under-explored in previous studies, particularly with regard to leveraging AI technologies for automatic assessment. This study investigates the capability of LLMs, specifically GPT and Claude, to support automatic assessment of target-language quality in interpreting. We conducted a descriptive analysis of the scores generated by the LLMs and correlated them with human evaluation. We also examined the LLMs' rating processes and criteria by comparing the revisions the models made to improve target-language quality. Our analysis of the differences between human and LLM scores, together with the LLMs' own explanations of their scoring rationale, suggests that LLMs can be applied to assess target-language quality in interpreting. Human and automatic assessments are generally aligned, but discrepancies occur in approximately 7.6% of cases. These discrepancies often involve sentence structure, complexity, vocabulary, register and flow, which points to divergent perceptions of quality between humans and LLMs. The findings indicate the potential of AI technology to supplement traditional human evaluation.
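The abstract describes correlating LLM-generated scores with human scores and counting the cases where the two diverge. It does not say which correlation coefficient or discrepancy criterion was used, so the following is only a minimal sketch under stated assumptions: Spearman's rank correlation as the agreement measure, and a hypothetical `threshold` on the absolute score difference to flag a discrepancy. All function names and the toy scores are illustrative and are not taken from the study.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def discrepancy_rate(human, llm, threshold):
    """Share of items whose scores differ by more than `threshold`.

    The threshold is an assumption of this sketch, not the study's criterion.
    """
    flagged = sum(1 for h, m in zip(human, llm) if abs(h - m) > threshold)
    return flagged / len(human)


# Toy scores, invented purely for illustration.
human_scores = [4, 3, 5, 2]
llm_scores = [4, 3, 2, 2]
agreement = spearman(human_scores, llm_scores)
rate = discrepancy_rate(human_scores, llm_scores, threshold=1)
```

A rank-based coefficient is chosen here because rating scales are ordinal; a different agreement statistic could be substituted without changing the overall shape of the comparison.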
| Original language | English |
|---|---|
| Pages (from-to) | 465-485 |
| Number of pages | 21 |
| Journal | The Interpreter and Translator Trainer |
| Volume | 19 |
| Issue number | 3-4 |
| Early online date | 17 Jul 2025 |
| Publication status | Published - 2 Oct 2025 |
Keywords
- Automatic assessment of interpreting
- automatic metrics
- explainable AI
- large language models
- target-language quality
ASJC Scopus subject areas
- Education
- Language and Linguistics
- Linguistics and Language
Title
Advancing automatic assessment of target-language quality in interpreter training with large language models: insights from explainable AI