document-summarization

pszemraj committed on Oct 5, 2022

Commit 11d8f98 • 1 Parent(s): 724c5ad

📝 add words

Signed-off-by: peter szemraj <peterszemraj@gmail.com>

Files changed (1)

app.py +5 -11

@@ -86,8 +86,8 @@ def proc_submission(
     _summaries = summarize_via_tokenbatches(
         tr_in,
-        model_sm if model_size == "base" else model,
-        tokenizer_sm if model_size == "base" else tokenizer,
+        model_sm if "base" in model_size.lower() else model,
+        tokenizer_sm if "base" in model_size.lower() else tokenizer,
         batch_length=token_batch_length,
         **settings,
     )
@@ -211,7 +211,7 @@ if __name__ == "__main__":
         gr.Markdown("# Document Summarization with Long-Document Transformers")
         gr.Markdown(
-            "TODO: Add a description of the model and how it works, and a link to the paper"
+            "This is an example use case for fine-tuned long document transformers. The model is trained on book summaries (via the BookSum dataset). The models in this demo are [LongT5-base](https://huggingface.co/pszemraj/long-t5-tglobal-base-16384-book-summary) and [Pegasus-X-Large](https://huggingface.co/pszemraj/pegasus-x-large-book-summary)."
         )
         with gr.Column():
@@ -223,7 +223,7 @@ if __name__ == "__main__":
                 with gr.Column(scale=0.5, variant='compact'):
                     model_size = gr.Radio(
-                        choices=["base", "large"], label="Model Variant", value="base"
+                        choices=["LongT5-base", "Pegasus-X-large"], label="Model Variant", value="base"
                     )
                     num_beams = gr.Radio(
                         choices=[2, 3, 4],
@@ -308,13 +308,7 @@ if __name__ == "__main__":
         with gr.Column():
             gr.Markdown("### About the Model")
             gr.Markdown(
-                "- [This model](https://huggingface.co/pszemraj/led-large-book-summary) is a fine-tuned checkpoint of [allenai/led-large-16384](https://huggingface.co/allenai/led-large-16384) on the [BookSum dataset](https://arxiv.org/abs/2105.08209).The goal was to create a model that can generalize well and is useful in summarizing lots of text in academic and daily usage."
-            )
-            gr.Markdown(
-                "- The two most important parameters-empirically-are the `num_beams` and `token_batch_length`. However, increasing these will also increase the amount of time it takes to generate a summary. The `length_penalty` and `repetition_penalty` parameters are also important for the model to generate good summaries."
-            )
-            gr.Markdown(
-                "- The model can be used with tag [pszemraj/led-large-book-summary](https://huggingface.co/pszemraj/led-large-book-summary). See the model card for details on usage & a notebook for a tutorial."
+                "These models are fine-tuned on the [BookSum dataset](https://arxiv.org/abs/2105.08209).The goal was to create a model that can generalize well and is useful in summarizing lots of text in academic and daily usage."
             )
             gr.Markdown("---")
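
Review note: the variant check in `proc_submission` now matches on the substring "base" rather than on equality, which is what lets the new radio labels keep routing to `model_sm`. The sketch below is a standalone illustration of that check, not code from `app.py`. The helper names `pick_variant` and `default_is_valid` are hypothetical, and the string stand-ins replace the real model objects. It also checks the radio default: the diff leaves `value="base"`, which is not one of the new choices. How Gradio renders a default outside `choices` is not shown in the diff, so the sketch only tests membership.

```python
# Hypothetical stand-ins: only the two radio labels, the default value, and the
# names model_sm / model are taken from the diff; everything else is illustrative.
CHOICES = ["LongT5-base", "Pegasus-X-large"]
DEFAULT_VALUE = "base"  # default passed to gr.Radio in this commit


def pick_variant(model_size: str, small, large):
    """Mirror the commit's check: any label containing 'base' selects the small model."""
    return small if "base" in model_size.lower() else large


def default_is_valid(choices, value) -> bool:
    """Return True when the radio default is one of the offered choices."""
    return value in choices


if __name__ == "__main__":
    for label in CHOICES + [DEFAULT_VALUE]:
        print(label, "->", pick_variant(label, "model_sm", "model"))
    print("default in choices:", default_is_valid(CHOICES, DEFAULT_VALUE))
```

Under these assumptions, the substring check still sends the leftover default "base" to the small model, so summarization would behave as before. Setting the default to "LongT5-base" would be one way to make the radio default match a listed choice.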