Adversarial attacks on generative AI models
In the domain of large language models (LLMs) such as ChatGPT, paraphrasing has attracted notable interest. Researchers have applied paraphrasing techniques across a spectrum of applications, from defending these models against adversarial attacks to exploiting their vulnerabilities. This research explores the multifaceted role of paraphrasing in this context: it examines how paraphrasing affects both defensive and adversarial mechanisms, and seeks to develop an algorithm that can discern the various implications of paraphrasing in the broader landscape of large language models.
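To make the defensive side of this idea concrete, the sketch below illustrates one common paraphrase-based defense pattern: rewriting a user prompt before it reaches the target model, so that attacks relying on a verbatim adversarial string are disrupted. This is an illustrative assumption, not the project's algorithm; in practice the paraphraser would itself be a language model, whereas here a toy synonym table stands in for it, and `echo_model` is a hypothetical stand-in for the protected LLM.

```python
import random

# Toy synonym table standing in for an LLM-based paraphraser (assumption:
# a real defense would generate the paraphrase with a separate model).
SYNONYMS = {
    "quickly": ["rapidly", "swiftly"],
    "show": ["display", "reveal"],
    "answer": ["respond to", "reply to"],
}

def paraphrase(prompt: str, seed: int = 0) -> str:
    """Return a lightly paraphrased copy of `prompt`.

    Word-level synonym substitution is only a stand-in; the defensive
    idea is that the exact token sequence of the input is not preserved.
    """
    rng = random.Random(seed)
    out = []
    for word in prompt.split():
        key = word.lower().strip(".,!?")
        # Replace the word if we have synonyms for it, else keep it.
        out.append(rng.choice(SYNONYMS[key]) if key in SYNONYMS else word)
    return " ".join(out)

def defended_query(prompt: str, model) -> str:
    """Paraphrase the prompt before it reaches the target model."""
    return model(paraphrase(prompt))

# Hypothetical target model: a stub that simply echoes its input.
echo_model = lambda p: p
print(defended_query("Please answer quickly", echo_model))
```

The same pipeline also hints at the adversarial direction studied in this project: an attacker can run a paraphraser over a blocked prompt to search for semantically equivalent variants that slip past input filters.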
Prof Chris Leckie, University of Melbourne
Dr Sarah Erfani, University of Melbourne
Dr Hadi Dolatabadi, University of Melbourne