jwkirchenbauer committed
Commit 0a88363 · 1 Parent(s): 8ec4512
bold settings name descriptions
demo_watermark.py CHANGED (+10 -10)
@@ -579,26 +579,26 @@ def run_gradio(args, model=None, device=None, tokenizer=None):
     """
     #### Generation Parameters:

-    - Decoding Method : We can generate tokens from the model using either multinomial sampling or greedy decoding.
-    - Sampling Temperature : If using multinomial sampling, we can set the temperature of the sampling distribution.
+    - **Decoding Method** : We can generate tokens from the model using either multinomial sampling or greedy decoding.
+    - **Sampling Temperature** : If using multinomial sampling, we can set the temperature of the sampling distribution.
     0.0 is equivalent to greedy decoding, and 1.0 is the maximum amount of variability/entropy in the next token distribution.
     0.7 strikes a nice balance between faithfulness to the model's estimate of the top candidates and added variety. Does not apply to greedy decoding.
-    - Generation Seed : The integer to pass to the torch random number generator before running generation. Makes the multinomial sampling strategy's
+    - **Generation Seed** : The integer to pass to the torch random number generator before running generation. Makes the multinomial sampling strategy's
     outputs reproducible. Does not apply to greedy decoding.
-    - Number of Beams : When using greedy decoding, we can also set the number of beams to > 1 to enable beam search.
+    - **Number of Beams** : When using greedy decoding, we can also set the number of beams to > 1 to enable beam search.
     This is not implemented for multinomial sampling (and is excluded from the paper) but may be added in the future.
-    - Max Generated Tokens : The `max_new_tokens` parameter passed to the generation method to stop the output at a certain number of new tokens.
+    - **Max Generated Tokens** : The `max_new_tokens` parameter passed to the generation method to stop the output at a certain number of new tokens.
     Note that the model is free to generate fewer tokens depending on the prompt.
     Implicitly this sets the maximum possible number of prompt tokens to the model's maximum input length minus `max_new_tokens`,
     and inputs will be truncated accordingly.

     #### Watermark Parameters:

-    - gamma : The fraction of the vocabulary to be partitioned into the greenlist at each generation step.
+    - **gamma** : The fraction of the vocabulary to be partitioned into the greenlist at each generation step.
     Smaller gamma values create a stronger watermark by enabling the watermarked model to achieve
     a greater differentiation from human/unwatermarked text, because it is preferentially sampling
     from a smaller green set, making those tokens less likely to occur by chance.
-    - delta : The amount of positive bias to add to the logits of every token in the greenlist
+    - **delta** : The amount of positive bias to add to the logits of every token in the greenlist
     at each generation step before sampling/choosing the next token. Higher delta values
     mean that the greenlist tokens are more heavily preferred by the watermarked model,
     and as the bias becomes very large the watermark transitions from "soft" to "hard".
@@ -607,7 +607,7 @@ def run_gradio(args, model=None, device=None, tokenizer=None):

     #### Detector Parameters:

-    - z-score threshold : the z-score cutoff for the hypothesis test. Higher thresholds (such as 4.0) make
+    - **z-score threshold** : the z-score cutoff for the hypothesis test. Higher thresholds (such as 4.0) make
     _false positives_ (predicting that human/unwatermarked text is watermarked) very unlikely,
     as genuine human text with a significant number of tokens will almost never achieve
     that high of a z-score. Lower thresholds will capture more _true positives_ as some watermarked
@@ -615,11 +615,11 @@ def run_gradio(args, model=None, device=None, tokenizer=None):
     be flagged as "watermarked". However, a lower threshold will increase the chance that human text
     that contains a slightly higher than average number of green tokens is erroneously flagged.
     4.0-5.0 offers extremely low false positive rates while still accurately catching most watermarked text.
-    - Ignore Bigram Repeats : This alternate detection algorithm only considers the unique bigrams in the text during detection,
+    - **Ignore Bigram Repeats** : This alternate detection algorithm only considers the unique bigrams in the text during detection,
     computing the greenlists based on the first token in each pair and checking whether the second falls within the list.
     This means that `T` is now the number of unique bigrams in the text, which becomes less than the total
     number of tokens generated if the text contains a lot of repetition. See the paper for a more detailed discussion.
-    - Normalizations : we implement a few basic normalizations to defend against various adversarial perturbations of the
+    - **Normalizations** : we implement a few basic normalizations to defend against various adversarial perturbations of the
     text analyzed during detection. Currently we support converting all characters to unicode,
     replacing homoglyphs with a canonical form, and standardizing the capitalization.
     See the paper for a detailed discussion of input normalization.
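The sketches below unpack a few of the settings documented in this diff. First, a minimal sketch of how the generation parameters map onto Hugging Face `generate` arguments; the model name, prompt, and seed are illustrative placeholders rather than the demo's actual defaults:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "gpt2" and the prompt are placeholders, not what demo_watermark.py loads.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

max_new_tokens = 200  # Max Generated Tokens
# Implicitly, the prompt gets at most (max input length - max_new_tokens) tokens.
inputs = tokenizer(
    "A prompt to continue",
    return_tensors="pt",
    truncation=True,
    max_length=model.config.max_position_embeddings - max_new_tokens,
)

torch.manual_seed(123)  # Generation Seed: makes multinomial sampling reproducible

output_ids = model.generate(
    **inputs,
    do_sample=True,    # Decoding Method: True = multinomial sampling, False = greedy
    temperature=0.7,   # Sampling Temperature: only used when do_sample=True
    num_beams=1,       # Number of Beams: > 1 enables beam search under greedy decoding
    max_new_tokens=max_new_tokens,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```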
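Next, a toy sketch of the biasing rule that gamma and delta describe. This is not the repository's actual `WatermarkLogitsProcessor`; in particular, seeding the greenlist directly on the previous token id is a simplified stand-in for the scheme's hash function:

```python
import torch

def bias_greenlist_logits(logits, prev_token, gamma=0.25, delta=2.0):
    """Add delta to the logits of a pseudorandom gamma-fraction of the vocabulary."""
    vocab_size = logits.shape[-1]
    # Simplified stand-in for the hashing scheme: seed on the previous token id.
    gen = torch.Generator().manual_seed(int(prev_token))
    greenlist = torch.randperm(vocab_size, generator=gen)[: int(gamma * vocab_size)]
    biased = logits.clone()
    biased[greenlist] += delta  # soft bias; as delta grows, "soft" becomes "hard"
    return biased
```

With moderate values such as gamma=0.25 and delta=2.0, green tokens are preferred but never forced, which is what keeps the watermark "soft" and preserves text quality on low-entropy continuations.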
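The hypothesis test behind the z-score threshold fits in a few lines. Under the null hypothesis of unwatermarked text, each of the `T` scored tokens is green with probability gamma, so the green count has mean `gamma * T` and variance `T * gamma * (1 - gamma)`; the counts in this example are made up to show the arithmetic:

```python
from math import sqrt

def z_score(green_count: int, T: int, gamma: float = 0.25) -> float:
    # Standardized one-proportion test statistic for the observed green count.
    return (green_count - gamma * T) / sqrt(T * gamma * (1 - gamma))

z = z_score(green_count=75, T=200)  # expected 50 green under the null; z ~ 4.08
flagged = z > 4.0                   # crosses a 4.0 threshold, flagged as watermarked
```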
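A sketch of the "Ignore Bigram Repeats" counting rule; `in_greenlist` here is a hypothetical stand-in for the real greenlist membership test (seed on the first token of the pair, check the second):

```python
def score_unique_bigrams(tokens, in_greenlist):
    # Score each distinct (prev, next) pair once, so repeated phrases
    # cannot inflate the green count.
    bigrams = set(zip(tokens[:-1], tokens[1:]))
    green_count = sum(in_greenlist(prev, nxt) for prev, nxt in bigrams)
    T = len(bigrams)  # T is now the number of unique bigrams, not total tokens
    return green_count, T
```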
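Finally, a sketch of the kind of input normalization described above; the homoglyph table is a tiny illustrative sample, not the repository's actual normalizer:

```python
import unicodedata

# Illustrative sample only: a real normalizer maps many more confusable
# characters (here, a few Cyrillic lookalikes) to a canonical ASCII form.
HOMOGLYPHS = {"а": "a", "е": "e", "о": "o", "р": "p", "с": "c"}

def normalize_for_detection(text: str) -> str:
    text = unicodedata.normalize("NFC", text)              # canonical unicode form
    text = "".join(HOMOGLYPHS.get(ch, ch) for ch in text)  # canonicalize homoglyphs
    return text.lower()                                    # standardize capitalization
```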