PROJECT SUMMARY

The Toxicity Scalpel: Prototyping and evaluating methods to remove harmful generative capability from foundation models

Focus Areas: News and Media
Research Programs: Machines
Status: Active

AI language models have made significant strides over the past few years. Computers are now capable of writing poetry and computer code, producing human-like text, summarising documents, engaging in natural conversation about a variety of topics, solving math problems, and translating between languages.

This rapid progress has been made possible by a trend in AI development where one general ‘foundational’ model is developed (usually using a large dataset from the internet) and then adapted many times to fit diverse applications, rather than beginning from scratch each time.

This method of automated decision-making (ADM) development can appear time- and cost-effective, but it ‘bakes in’ negative tendencies such as the generation of toxic content, misogyny, or hate speech at the foundational layer, and these tendencies subsequently spread to each downstream application.

The goal of this project is to examine how language models used in ADM systems might be improved by making modifications at the foundation model stage, rather than at the application level, where computational interventions, social responsibility, and legal liability have historically been focussed.

PUBLICATIONS

Measuring Misogyny in Natural Language Generation: Preliminary Results from a Case Study on two Reddit Communities, 2023

Snoswell, A., Nelson, L., Xue, H., Salim, F., Suzor, N., & Burgess, J.

Journal article

RESEARCHERS

Prof Flora Salim

Chief Investigator,
UNSW

Prof Nic Suzor

Chief Investigator,
QUT

Dr Hao Xue

Associate Investigator,
UNSW

Dr Aaron Snoswell

Research Fellow,
QUT

Lucinda Nelson

PhD Student,
QUT
