Hackathon develops methods to mitigate bias in large language models

Authors  Natalie Campbell
Date 13 November 2023

A new research project conceived at the ADM+S Hackathon has been awarded $10,000 to develop an AI co-pilot strategy to reduce human and machine bias in large language models such as ChatGPT.

The project, named Sub-Zero Bias, was one of five projects developed over a two-day hackathon hosted by the ARC Centre of Excellence for Automated Decision-Making and Society.

“Our project was designed to observe how bias presents itself through qualitative data analysis, by leveraging the interpretative capabilities of large language models (GPT-4 & LLaMa 2) and also human coders,” explained Sub-Zero team member Ned Watt.

“It involved putting humans and language models head-to-head in a thematic analysis task using testimonies and parliamentary transcripts surrounding the Robodebt scandal. At times we found some eerie consistency between language models and humans, but at other times we found significant differences.”

Hackathon judge Dr Johanne Trippas said, “The Sub-Zero Bias project introduced a perspective of human-AI collaboration for qualitative research as we grapple with the intricate dance of bias and perception within our AI era.

“The project was not only concerned with creating advanced qualitative research mechanisms but also about ingraining creativity and self-reflection into the process. The proposed human-AI collaboration does not just navigate the data; it invites us to scrutinise our biases and preconceptions to pursue more nuanced research outcomes.”

The hackathon brought together PhD students and researchers from various disciplines to investigate and map the values baked into some of the most popular generative language models, including OpenAI’s ChatGPT, Google’s Bard, and Microsoft’s Bing Chat.

Hackathon organiser Sally Storey explained, “With a small number of major corporations controlling the most advanced generative language models, these companies hold significant influence over various aspects of the information landscape. As a result, the values ingrained in their models have critical social and political implications.”

Teams were encouraged to choose an area of bias to investigate, from gender bias, political bias, Indigenous, colonial and racial biases, to disability discrimination.

Hackathon teams received feedback on their ideas from a panel of judges which included Peter Bailey (Canva), Nick Craswell (Microsoft Search), Dr Johanne Trippas (RMIT University), Sarvnaz Karimi (CSIRO) and Prof Chirag Shah (University of Washington).

2023 Hackathon Projects:

Sub-Zero Bias. A Comparative Thematic Analysis Experiment of Robodebt Discourse Using Humans and LLMs
Liam Magee (mentor), Dr Lida Ghahremanlou (mentor), Ned Watt, Hiruni Kegalle, Rhea D’Silva, Daniel Whelan-Shamy, Awais Hameed Khan

This project investigated human and machine bias in the context of large language models like GPT-4 and Llama 2 for Qualitative Data Analysis (QDA). It highlighted the challenges of addressing bias which stem from probabilistic reasoning and extensive human-generated training data. Rather than focusing solely on detecting and mitigating bias, the project introduced an AI co-pilot strategy for QDA, which aims to reduce the cognitive burden on researchers while encouraging them to reflect on their own biases and how they might influence research outcomes.

Polls and Prejudices: Investigating Bias in LLM-Generated Political Personas
Ash Watson (mentor), Hadi Dolatabadi, Marwah Alaofi, Arjun Srinivas, Mohammad Faisal

The study examined political bias in three large language models (LLMs), Bing, ChatGPT 3.5 and Llama 2 13B, and how this bias affects AI-generated advertising content and communication strategies. The research considered how communication professionals might use LLMs to develop voter personas and strategies for advertising political content. The study applied persona-focused prompts based on Australian research and discovered significant inaccuracies and representative biases in these models. The project’s goal was to identify biases in LLM-generated personas, aiming to support and improve the use, understanding and oversight of generative AI for audience engagement and in public interest campaigning.

Generative Storytelling Generating Biases: Investigating Gender, Racial, and Disability Bias in LLM Chatbots
Danula Hettiachchi (mentor), Jen Wilson, Yunus Yigit, Anand Badola

Stories significantly influence our understanding of the world and ourselves, particularly in shaping the imagination of young children. Children not only learn attitudes and behaviours from story characters but also form perspectives on identity and gender through them. This project investigates how moral tales for children, generated by LLMs, exhibit biases related to gender, race, and ability. Inspired by recent research which highlights gender and racial bias in children’s storybooks, this research focuses on chatbots driven by large language models (LLMs) and examines the implicit biases in these models when generating stories for 10-year-old children.

Detecting Australian Immigration Biases in ChatGPT
Dr Abdul Karim Obeid (mentor), Dr Silvia X. Montaña-Niño (mentor), Ekaterina Tokereva, Brooke Anne Coco, Vishnuprasad Padinjaredath Suresh, Yonchanok Khaokaew

This team conducted a case study of the Australian Department of Home Affairs’ experimentation with large language models (LLMs). While departments such as Home Affairs have blocked the use of ChatGPT, exceptions exist under which certain teams can still use it. A freedom of information request indicated that no contemporaneous records were kept of all questions or “prompts” entered into ChatGPT or other tools as part of the tests, which has limited the public’s ability to hold the department to account over the potential for incorrectly applying these technologies. This project interrogated what such “potentially incorrect applications of the technology” could look like by creating a bespoke immigration persona in ChatGPT, entitled “ImmoGAN”, which consisted of an instruction to ChatGPT to ‘role-play’ an immigration officer as it interpreted that officer’s behaviours. The team used a red-teaming approach, emulating scenarios designed to provoke and identify vulnerabilities in the software that bear on the ethical duties of the Australian Department of Home Affairs.

Response Personalization in Large Language Models
Mark Andrejevic (mentor), Hmdh M Alknjr, Stephanie Livingstone, Wynston Lee, Frances Shaw, Hao Xue

The goal of this project was to explore what types of bias might be incorporated into personalised automated responses to timely political issues. The team crafted a series of prompts to test how three LLMs would frame issues raised by the Voice to Parliament and, separately, COVID-19 vaccines for different groups of Australians. The team found that the LLMs attempted to endow their responses with a sense of personality. These personas were crude and patronising in style and tone, as was their use of different metaphors in explanations to different groups. Having identified damaging stereotypes, the team proposed approaches for scaling up personalisation-focused tests to reveal the ways in which personalisation might reproduce bias and stereotypes.

Sub-Zero Bias were awarded first place and received $10,000 in research support to continue developing their project, guided by an ADM+S senior researcher. The winning team will also travel to Canva’s Sydney office in December 2023 to present their findings and to learn about Canva’s responsible artificial intelligence work and interests during a workshop.

“Using the prize money, future research design will assess the extent to which we can use this methodology to measure bias in human researchers, as well as bias in language models, and explore whether language models can be leveraged effectively and safely alongside human coders in qualitative research,” said Ned.

The Hackathon was delivered by the ADM+S Research Training Program.
