jwkirchenbauer committed
Commit 8ec4512 • Parent(s): 5b3e92c

update framework acknowledgement

Files changed: demo_watermark.py (+17 -7)
```diff
--- a/demo_watermark.py
+++ b/demo_watermark.py
@@ -411,6 +411,8 @@ def run_gradio(args, model=None, device=None, tokenizer=None):
             gr.Markdown(
                 """
                 ## 💧 [A Watermark for Large Language Models](https://arxiv.org/abs/2301.10226) 🔍
+
+                Demo made possible by the HuggingFace 🤗 [text-generation-inference](https://github.com/huggingface/text-generation-inference) serving framework.
                 """
             )
         with gr.Column(scale=1):
@@ -445,18 +447,27 @@ def run_gradio(args, model=None, device=None, tokenizer=None):
             was likely to have been generated by a model that uses the watermark.
 
             This space showcases a watermarking approach that can be applied to _any_ generative language model.
-            For demonstration purposes, the space …
+            For demonstration purposes, the space demos a selection of multi-billion parameter models (see the following note for caveats).
             """
             )
-        with gr.Accordion("A note on …
+        with gr.Accordion("A note on the available models:",open=False):
             gr.Markdown(
             """
-            This demo uses open-source language models …
-
-
-            For best results, prompt …
+            This demo uses open-source language models. Today, these models are less powerful than proprietary commercial tools like ChatGPT, Claude, Bard, or Bing/Sydney.
+
+            Smaller models like OPT-6.7b are designed to "complete" your prompt, and are not fine-tuned to follow instructions.
+            For best results, prompt that model with a few sentences that form the beginning of a paragraph, and then allow it to "continue" your paragraph.
             Some examples include the opening paragraph of a wikipedia article, or the first few sentences of a story.
             Longer prompts that end mid-sentence will result in more fluent generations.
+
+            The larger models available in this demo are fine-tuned to follow instructions but have different strengths and will showcase different
+            types of watermark behavior. [BLOOMZ (175B)](https://huggingface.co/bigscience/bloomz) is an instruction tuned variant of BLOOM capable of following instructions in dozens of languages zero-shot
+            and can generate long and coherent paragraphs and stories given the right prompt.
+            The FLAN models [FLAN-t5-xxl (11B)](https://huggingface.co/google/flan-t5-xxl) and [FLAN-UL2 (20B)](https://huggingface.co/google/flan-ul2) are fine-tuned on a variety of in-context few-shot learning NLP tasks,
+            such as reasoning, and question answering.
+
+            Generally, short, low entropy scenarios where the model has very few choices in terms of correct/suitable responses to the prompt
+            will not exhibit as strong of a watermark presence, while longer watermarked outputs will produce higher detection statistics.
             """
             )
         gr.Markdown(
@@ -769,7 +780,6 @@ def run_gradio(args, model=None, device=None, tokenizer=None):
         select_green_tokens.change(fn=detect_partial, inputs=[output_with_watermark,session_args,session_tokenizer], outputs=[with_watermark_detection_result,session_args,session_tokenizer])
         select_green_tokens.change(fn=detect_partial, inputs=[detection_input,session_args,session_tokenizer], outputs=[detection_result,session_args,session_tokenizer])
 
-
     demo.queue(concurrency_count=3)
 
     if args.demo_public:
```
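The accordion note added in the second hunk ends with a claim worth unpacking: detection strength scales with the number of scored tokens, because the detector's z-score on the count of "green-list" tokens grows roughly with the square root of the text length. Below is a minimal sketch of the generate-then-detect loop the demo wraps in a UI. It assumes the `WatermarkLogitsProcessor` and `WatermarkDetector` classes from the repo's `watermark_processor` module (the same module `demo_watermark.py` imports from); the parameter values and prompt are illustrative, not the demo's defaults.

```python
# Minimal sketch of watermarked generation followed by detection,
# assuming the repo's watermark_processor API (class and argument
# names taken from that module; values here are illustrative).
from transformers import AutoTokenizer, AutoModelForCausalLM, LogitsProcessorList
from watermark_processor import WatermarkLogitsProcessor, WatermarkDetector

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-6.7b")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-6.7b")

watermark_processor = WatermarkLogitsProcessor(
    vocab=list(tokenizer.get_vocab().values()),
    gamma=0.25,   # fraction of the vocabulary placed on the "green" list each step
    delta=2.0,    # logit boost added to green-list tokens
    seeding_scheme="simple_1",
)

prompt = "The first paragraph of a Wikipedia article about glaciers begins:"
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=200,  # longer outputs accumulate stronger detection statistics
    logits_processor=LogitsProcessorList([watermark_processor]),
)
# Keep only the newly generated continuation, not the prompt.
output_text = tokenizer.decode(
    output_ids[0, inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)

detector = WatermarkDetector(
    vocab=list(tokenizer.get_vocab().values()),
    gamma=0.25,                  # must match the value used at generation time
    seeding_scheme="simple_1",
    device=model.device,
    tokenizer=tokenizer,
    z_threshold=4.0,             # z-score above which text is flagged as watermarked
)
score = detector.detect(output_text)  # dict including z_score and prediction
print(score["z_score"], score["prediction"])
```

With gamma = 0.25, a passage of T scored tokens must contain at least roughly 0.25·T + 4·sqrt(0.1875·T) green tokens to clear the default z-threshold of 4, which is why short or low-entropy completions often fall below the detection bar while longer generations clear it comfortably.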
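The acknowledgement added in the first hunk refers to serving the larger models (BLOOMZ, FLAN-UL2) through HuggingFace's text-generation-inference server, which ships a server-side implementation of this same watermarking scheme. A rough client-side sketch follows; the endpoint URL is hypothetical, and the `watermark` flag is assumed from the `text-generation` Python client's generate parameters rather than taken from this commit.

```python
# Rough sketch: querying a running text-generation-inference server with
# watermarking enabled. Assumes the `text-generation` Python client; the
# endpoint URL and the use of its `watermark` flag are assumptions here.
from text_generation import Client

client = Client("http://localhost:8080")  # hypothetical local TGI endpoint
response = client.generate(
    "The opening paragraph of a story about a lighthouse keeper:",
    max_new_tokens=200,
    do_sample=True,
    watermark=True,  # apply the green-list logit bias server-side during decoding
)
print(response.generated_text)
```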