Multi-step Jailbreaking Privacy Attacks on ChatGPT

18 May 2023

Haoran Li, Wei Fan, Mingshi Xu, Yangqiu Song (Dept. of CSE, Hong Kong University of Science and Technology), Dadi Guo (Center for Data Science, AAIS, Peking University), Jie Huang (Dept. of Computer Science, University of Illinois at Urbana-Champaign), Fanpu Meng (The Law School, University of Notre Dame)

Abstract

As powerful LLMs devour existing text data from various domains (e.g., GPT-3 is trained on 45TB of text), it is natural to ask whether private information is included in the training data and what privacy threats these LLMs and their downstream applications can bring.

In this paper, we study the privacy threats from OpenAI’s ChatGPT and the New Bing enhanced by ChatGPT and show that application-integrated LLMs may cause new privacy threats. To this end, we conduct extensive experiments to support our claims and discuss LLMs’ privacy implications.
