Robust Multi-bit Text Watermark with LLM-based Paraphrasers
Abstract
We propose an imperceptible multi-bit text watermark embedded by paraphrasing with LLMs. We fine-tune a pair of LLM paraphrasers that are designed to behave differently, so that the difference between their paraphrases, reflected in the text semantics, can be identified by a trained decoder. To embed our multi-bit watermark, we alternate between the two paraphrasers to encode a pre-defined binary code at the sentence level, and we use a text classifier as the decoder to recover each bit of the watermark. Through extensive experiments, we show that our watermarks achieve over 99.99% detection AUC with small (1.1B) text paraphrasers while preserving the semantic information of the original sentences. More importantly, our pipeline is robust under word-substitution and sentence-paraphrasing perturbations and generalizes well to out-of-distribution data. We also demonstrate the stealthiness of our watermark with an LLM-based evaluation. We open-source the code at https://github.com/xiaojunxu/multi-bit-text-watermark.
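To make the embedding and decoding scheme concrete, here is a minimal sketch of the sentence-level loop the abstract describes: one watermark bit is encoded per sentence by choosing which of the two paraphrasers rewrites it, and the decoder classifies each sentence back into a bit. The names `paraphrase_0`, `paraphrase_1`, and `classify_bit` are hypothetical placeholders for the paper's two fine-tuned paraphrasers and trained decoder, not the actual API of the released code.

```python
from typing import Callable, List

def embed_watermark(
    sentences: List[str],
    bits: List[int],
    paraphrase_0: Callable[[str], str],  # paraphraser that encodes bit 0
    paraphrase_1: Callable[[str], str],  # paraphraser that encodes bit 1
) -> List[str]:
    """Encode one bit per sentence by picking which paraphraser rewrites it."""
    assert len(sentences) >= len(bits), "need at least one sentence per bit"
    out = []
    for i, sent in enumerate(sentences):
        if i < len(bits):
            p = paraphrase_1 if bits[i] == 1 else paraphrase_0
            out.append(p(sent))
        else:
            out.append(sent)  # sentences beyond the message are left as-is
    return out

def decode_watermark(
    sentences: List[str],
    n_bits: int,
    classify_bit: Callable[[str], int],  # trained decoder: sentence -> {0, 1}
) -> List[int]:
    """Recover the message by classifying which paraphraser produced each sentence."""
    return [classify_bit(s) for s in sentences[:n_bits]]
```

For detection (as opposed to message recovery), the decoded bits can be compared against the embedded code and a match score computed over the sentences, which is one natural way to obtain the detection AUC reported above.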
Community
This paper proposes a way to inject a watermark into text by paraphrasing.
Code: https://github.com/xiaojunxu/multi-bit-text-watermark
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- WaterPark: A Robustness Assessment of Language Model Watermarking (2024)
- Revisiting the Robustness of Watermarking to Paraphrasing Attacks (2024)
- CredID: Credible Multi-Bit Watermark for Large Language Models Identification (2024)
- $B^4$: A Black-Box Scrubbing Attack on LLM Watermarks (2024)
- A Watermark for Order-Agnostic Language Models (2024)
- De-mark: Watermark Removal in Large Language Models (2024)
- Provably Robust Watermarks for Open-Source Language Models (2024)