KHOI VO NGUYEN
Thesis Title
LLMs may be fooled into labeling documents in a different language (to the query) as relevant
Research Description
Building effective information retrieval (IR) systems requires reliable evaluation of how well retrieved documents meet users’ needs. Traditionally, this evaluation has relied on expert-annotated test collections (e.g., TREC), but manual judging is slow and costly. Since around 2023, researchers have explored using large language models (LLMs) to automate relevance assessment. While these approaches offer clear advantages in speed and scalability, they also introduce new risks and biases, so it is essential to identify these vulnerabilities and understand how they might be exploited. Previous work has shown that LLMs can be misled into labeling non-relevant passages as relevant through the injection of specific keywords. Our research builds on this finding by examining how malicious actors might exploit this weakness and by identifying potential mitigation strategies. We test different injection techniques, in English and in other languages, on passages from existing test collections and quantify how far LLM-based judgments deviate from expert annotations.
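As a rough illustration of this evaluation loop, the minimal Python sketch below (assuming scikit-learn for the agreement measure) injects query keywords into expert-judged non-relevant passages and reports Cohen's kappa between the judge's labels and the expert labels before and after injection. Here naive_judge is a hypothetical stand-in for the actual LLM prompt, and the query, passages, and labels are purely illustrative, not drawn from any test collection.

    # Minimal sketch: keyword injection against a relevance judge, with
    # agreement against expert labels measured via Cohen's kappa.
    # naive_judge is a toy stand-in for an LLM assessor: it labels a
    # passage relevant when at least half the query terms occur in it,
    # i.e. exactly the surface cue that keyword injection exploits.
    from sklearn.metrics import cohen_kappa_score

    def inject_keywords(passage: str, query: str) -> str:
        # One simple attack variant: prepend the query terms to the passage.
        return f"{query} {passage}"

    def naive_judge(query: str, passage: str) -> int:
        terms = query.lower().split()
        words = set(passage.lower().split())
        hits = sum(t in words for t in terms)
        return int(hits >= len(terms) / 2)

    query = "caffeine and sleep"
    passages = [  # (passage, expert qrel label)
        ("Caffeine intake in the evening delays sleep onset", 1),
        ("The museum's new wing opens to visitors in May", 0),
        ("Local elections will be held next Tuesday", 0),
    ]

    expert = [label for _, label in passages]
    before = [naive_judge(query, p) for p, _ in passages]
    after = [naive_judge(query, inject_keywords(p, query)) for p, _ in passages]

    print("kappa before injection:", cohen_kappa_score(expert, before))  # 1.0
    print("kappa after injection: ", cohen_kappa_score(expert, after))   # 0.0

In the actual experiments the stub would be replaced by an LLM relevance-assessment prompt over real test-collection passages, and the injected keywords would also include translations into other languages to probe the cross-lingual failure described in the thesis title.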
Supervisors
Professor Mark Sanderson, RMIT University
Dr Oleg Zendel, RMIT University