What is it about?
We propose two approaches to "sanitize" sentence embeddings extracted by large language models (e.g., BERT). Theoretically, they ensure metric-based local differential privacy (LDP). Empirically, they defend against various embedding-based attacks, such as membership inference and sensitive attribute inference. We further propose two extensions: one fully protects training inputs by additionally sanitizing labels; the other addresses the "curse of dimensionality" by introducing two trainable linear maps.
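For intuition, here is a minimal sketch (Python/NumPy) of a generic metric-LDP sanitizer: it perturbs an embedding with noise whose density decays exponentially with the Euclidean distance, a standard way to obtain metric-based LDP. The paper's actual mechanisms, and its label-sanitization and dimensionality-reduction extensions, differ in their details; the function name `sanitize_embedding` and the parameter `epsilon` below are illustrative assumptions, not the paper's API.

```python
import numpy as np

def sanitize_embedding(embedding: np.ndarray, epsilon: float, rng=None) -> np.ndarray:
    """Perturb a sentence embedding with noise calibrated to the Euclidean
    metric, giving metric-based LDP with privacy parameter epsilon.

    The noise density is proportional to exp(-epsilon * ||z||_2): a uniformly
    random direction scaled by a Gamma(d, 1/epsilon)-distributed magnitude.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = embedding.shape[-1]
    # Uniform random direction on the unit sphere.
    direction = rng.standard_normal(d)
    direction /= np.linalg.norm(direction)
    # Noise magnitude drawn from Gamma(shape=d, scale=1/epsilon).
    magnitude = rng.gamma(shape=d, scale=1.0 / epsilon)
    return embedding + magnitude * direction
```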
Why is it important?
We are now in the era of LLMs. Web-based LLM APIs (e.g., ChatGPT and Bard) are increasingly popular. Directly submitting users' text data (e.g., prompts) to such services is risky and may reveal sensitive information. Our approaches enable users to sanitize/perturb the features (i.e., sentence embeddings) of their text inputs to protect privacy. Meanwhile, the sanitized features can still be used in various downstream tasks, at the cost of a moderate utility reduction.
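As a usage sketch of this client-side flow, building on the hypothetical `sanitize_embedding` above: the raw text and its exact embedding stay on the user's device, and only the sanitized embedding is submitted. The encoder choice (here, sentence-transformers) is an assumption for illustration, not the paper's setup.

```python
from sentence_transformers import SentenceTransformer  # any local encoder works

encoder = SentenceTransformer("all-MiniLM-L6-v2")
embedding = encoder.encode("my private prompt")          # extract the feature locally
private_embedding = sanitize_embedding(embedding, epsilon=10.0)
# Only `private_embedding` is sent to the web API / downstream model.
```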
Read the Original
This page is a summary of: Sanitizing Sentence Embeddings (and Labels) for Local Differential Privacy, April 2023, ACM (Association for Computing Machinery).
DOI: 10.1145/3543507.3583512.