arjunpatel committed
Commit ef09c1c
1 Parent(s): 19c6000
Typos

gradio_demo.py CHANGED (+35 -37)
@@ -18,15 +18,7 @@ seed_text = "This move is called "
 tf.random.set_seed(0)
 
 
-# need a function to sanitize imputs
-# - remove extra spaces
-# - make sure each word is capitalized
-# - format the moves such that it's clearer when each move is listed
-# - play with the max length parameter abit, and try to remove sentences that don't end in periods.
-
 def update_history(df, move_name, move_desc, generation, parameters):
-    # needs to format each move description with new lines to cut down on width
-
     new_row = [{"Move Name": move_name,
                 "Move Description": move_desc,
                 "Generation Type": generation,
@@ -84,7 +76,8 @@ with demo:
     decoding methods in the process! Each tab aims to explain each generation methodology available for the
     model. The dataframe below allows you to keep track of each move generated, to compare!""")
     gr.Markdown("<h3> How does text generation work? <h3>")
-    gr.Markdown("""Roughly, text generation models accept an input sequence of words (or parts of words,
+    gr.Markdown("""Roughly, text generation models accept an input sequence of words (or parts of words,
+    known as tokens.
     These models then output a corresponding set of words or tokens. Given the input, the model
     estimates the probability of another possible word or token appearing right after the given sequence. In
     other words, the model estimates conditional probabilities and ranks them in order to generate sequences
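
The "conditional probabilities" described in that Markdown block are directly inspectable. A minimal sketch, assuming the stock distilgpt2 checkpoint from the Hub rather than this repo's fine-tuned weights:

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = TFAutoModelForCausalLM.from_pretrained("distilgpt2")

inputs = tokenizer("This move is called ", return_tensors="tf")
logits = model(**inputs).logits                   # shape: (batch, seq_len, vocab_size)
next_token_probs = tf.nn.softmax(logits[0, -1])   # distribution over the next token
top = tf.math.top_k(next_token_probs, k=5)
for token_id, prob in zip(top.indices, top.values):
    print(repr(tokenizer.decode([int(token_id)])), float(prob))

Every decoding method in the tabs below is just a different rule for walking this distribution one step at a time.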
@@ -121,18 +114,19 @@ with demo:
             text_output_greedy = gr.Textbox(label="Move Description")
             text_button_greedy = gr.Button("Create my move!")
         with gr.TabItem("Beam Search"):
-            gr.Markdown("This tab lets you learn about using beam search!")
             gr.Markdown("""Beam search is an improvement on Greedy Search. Instead of directly grabbing the word that
             maximizes probability, we conduct a search with B number of candidates. We then try to find the next word
             that would most likely follow each beam, and we grab the top B candidates of that search. This may
             eliminate one of the original beams we started with, and that's okay! That is how the algorithm decides
-            on an optimal candidate. Eventually, the beam sequence terminate or are eliminated due to being too
+            on an optimal candidate. Eventually, the beam sequences terminate or are eliminated due to being too
+            improbable.
 
-            Increasing the number of beams will increase model generation time, but also result in a more thorough
-            Decreasing the number of beams will decrease decoding time, but it may not find an optimal
+            Increasing the number of beams will increase model generation time, but also result in a more thorough
+            search. Decreasing the number of beams will decrease decoding time, but it may not find an optimal
+            sentence.
 
             Play around with the num_beams parameter to experiment! """
-
+            )
             with gr.Row():
                 num_beams = gr.Slider(minimum=2, maximum=10, value=2, step=1,
                                       label="Number of Beams")
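
For reference, the num_beams slider ultimately feeds transformers' generate API. A sketch of beam search decoding, reusing the tokenizer and model loaded in the sketch above (the demo's own helper functions may differ):

input_ids = tokenizer("This move is called ", return_tensors="tf").input_ids
beam_output = model.generate(
    input_ids,
    max_length=60,
    num_beams=4,           # keep B=4 candidate sequences alive at each step
    early_stopping=True,   # stop once all beams reach an end-of-sequence token
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(beam_output[0], skip_special_tokens=True))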
@@ -140,21 +134,23 @@ with demo:
             text_output_beam = gr.Textbox(label="Move Description")
             text_button_beam = gr.Button("Create my move!")
         with gr.TabItem("Sampling and Temperature Search"):
-            gr.Markdown("This tab lets you experiment with adjusting the temperature of the generator")
             gr.Markdown(
-                """
-
-                but when generating cool move descriptions, we want some more variety!
+                """Greedy Search and Beam Search were both good at finding sequences that are likely to follow our
+                input text, but when generating cool move descriptions, we want some more variety!
 
                 Instead of choosing the word or token that is most likely to follow a given sequence, we can instead
-                ask the model to sample across the probability distribution of likely words.
-
+                ask the model to sample across the probability distribution of likely words.
+
+                It's kind of like walking into the tall grass and finding a Pokemon encounter.
+                There are different encounter rates, which allow
                 for the most common mons to appear (looking at you, Zubat), but also account for surprise, like shinys!
 
-                We might even want to go further, though. We can rescale the probability distributions directly
-                allowing for rare words to temporarily become more frequently. We do this using the
+                We might even want to go further, though. We can rescale the probability distributions directly
+                instead, allowing for rare words to temporarily become more frequent. We do this using the
+                temperature parameter.
 
-                Turn the temperature up, and rare tokens become very likely! Cool down, and we approach more sensible
+                Turn the temperature up, and rare tokens become very likely! Cool down, and we approach more sensible
+                output.
 
                 Experiment with turning sampling on and off, and by varying temperature below!.
                 """)
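
Concretely, temperature divides the logits by T before the softmax, so T > 1 flattens the distribution (boosting rare tokens) and T < 1 sharpens it. A sketch of sampling with temperature, again reusing tokenizer, model, and input_ids from the sketches above:

sample_output = model.generate(
    input_ids,
    max_length=60,
    do_sample=True,    # sample from the distribution instead of taking the argmax
    temperature=1.3,   # softmax(logits / 1.3): rare tokens become more likely
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(sample_output[0], skip_special_tokens=True))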
@@ -168,12 +164,12 @@ with demo:
             text_button_temp = gr.Button("Create my move!")
         with gr.TabItem("Top K and Top P Sampling"):
             gr.Markdown(
-                """
-
+                """When we want more control over the words we get to sample from, we turn to Top K and Top P
+                decoding methods!
 
 
-                The Top K sampling method selects the K most probable words given a sequence, and then samples from
-                rather than the whole vocabulary. This effectively cuts out low probability words.
+                The Top K sampling method selects the K most probable words given a sequence, and then samples from
+                that subset, rather than the whole vocabulary. This effectively cuts out low probability words.
 
 
                 Top P also reduces the available vocabulary to sample from, but instead of choosing the number of
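
Both truncation strategies are also plain generate kwargs and can be combined; a sketch, with the same assumed tokenizer, model, and input_ids as above:

truncated_output = model.generate(
    input_ids,
    max_length=60,
    do_sample=True,
    top_k=50,    # sample only among the 50 most probable next tokens
    top_p=0.92,  # ...then only from the smallest set covering 92% of the probability mass
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(truncated_output[0], skip_special_tokens=True))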
@@ -198,24 +194,26 @@ with demo:
     gr.Markdown("<h3> Generation History <h3>")
     # Displays a dataframe with the history of moves generated, with parameters
     history = gr.Dataframe(headers=["Move Name", "Move Description", "Generation Type", "Parameters"])
-    with gr.
+    with gr.Box():
         gr.Markdown("<h3>How did you make this?<h3>")
         gr.Markdown("""
-        I collected the dataset from Serebii (https://www.serebii.net) , a news source and aggregator of Pokemon info.
+        I collected the dataset from [Serebii](https://www.serebii.net), a news source and aggregator of Pokemon info.
 
 
-        I then added a seed phrase "This move is called" just before each move in order to assist the model in
+        I then added a seed phrase "This move is called" just before each move in order to assist the model in
+        generation.
 
 
-        I then followed HuggingFace's handy language_modeling.ipynb for fine-tuning distillgpt2 on this tiny dataset,
-        it surprisingly worked!
+        I then followed HuggingFace's handy language_modeling.ipynb for fine-tuning distilgpt2 on this tiny dataset,
+        and it surprisingly worked!
 
 
-        I learned all about text generation using the book Natural Language Processing with Transformers
-
-
-
-        """)
+        I learned all about text generation using the book [Natural Language Processing with Transformers](
+        https://www.oreilly.com/library/view/natural-language-processing/9781098103231/) by Lewis Tunstall,
+        Leandro von Werra and Thomas Wolf, as well as [this fantastic article](
+        https://huggingface.co/blog/how-to-generate) by Patrick von Platen. Thanks to all of these folks for creating
+        these learning materials, and thanks to the Hugging Face team for developing this product! """)
+
     text_button_baseline.click(create_move, inputs=[text_input_baseline, history],
                                outputs=[text_output_baseline, history])
     text_button_greedy.click(create_greedy_search_move, inputs=[text_input_greedy, history],
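
The seed-phrase trick described in that last block is just string prepending before tokenization; a hypothetical illustration (the move data and variable names here are invented, not taken from this repo):

seed_text = "This move is called "  # same seed phrase the demo uses
# Invented example rows standing in for the scraped Serebii data.
moves = [
    ("Thunder Wave", "A weak electric charge that paralyzes the target."),
    ("Aqua Jet", "The user lunges at the target so fast it always strikes first."),
]
training_texts = [f"{seed_text}{name}. {desc}" for name, desc in moves]

Each such string then becomes one training example for the causal-LM fine-tuning notebook mentioned above.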