pszemraj committed
Commit 916f36c • 1 Parent(s): 9bc2923

Signed-off-by: peter szemraj <peterszemraj@gmail.com>

Files changed (2)
  1. app.py +2 -2
  2. summarize.py +1 -1
app.py CHANGED
@@ -267,8 +267,8 @@ if __name__ == "__main__":
     "- [This model](https://huggingface.co/pszemraj/led-large-book-summary) is a fine-tuned checkpoint of [allenai/led-large-16384](https://huggingface.co/allenai/led-large-16384) on the [BookSum dataset](https://arxiv.org/abs/2105.08209).The goal was to create a model that can generalize well and is useful in summarizing lots of text in academic and daily usage."
     )
     gr.Markdown(
-    "- The two most important parameters-empirically-are the `num_beams` and `token_batch_length`. However, increasing these will also increase the amount of time it takes to generate a summary. The `length_penalty` and `repetition_penalty` parameters are also important for the model to generate good summaries."
-    )
+    "- The two most important parameters-empirically-are the `num_beams` and `token_batch_length`. "
+    )
     gr.Markdown(
     "- The model can be used with tag [pszemraj/led-large-book-summary](https://huggingface.co/pszemraj/led-large-book-summary). See the model card for details on usage & a notebook for a tutorial."
     )
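For reference, the `num_beams`, `length_penalty`, and `repetition_penalty` knobs mentioned in these Markdown strings are standard Hugging Face generation arguments, while `token_batch_length` is an app-level setting that controls how much text is fed to the model at once. Below is a minimal, hypothetical sketch (not the app's actual code) of how such parameters could be passed to the linked checkpoint; the specific values are illustrative assumptions.

```python
# Hypothetical sketch: passing the UI parameters through to generate()
# for the pszemraj/led-large-book-summary checkpoint. Values are illustrative.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "pszemraj/led-large-book-summary"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Replace with a long document to summarize..."
# token_batch_length is approximated here by the tokenizer's max_length (assumption).
inputs = tokenizer(text, truncation=True, max_length=2048, return_tensors="pt")

summary_ids = model.generate(
    **inputs,
    num_beams=4,             # more beams improve the search but slow generation down
    length_penalty=0.8,      # <1.0 nudges beam search toward shorter summaries
    repetition_penalty=3.5,  # discourages repeated phrases in the output
    max_new_tokens=512,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```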
summarize.py CHANGED
@@ -83,7 +83,7 @@ def summarize_via_tokenbatches(

     Args:
         input_text (str): the text to summarize
-        model (): the model to use for summarization
+        model (): the model to use for summarizationz
         tokenizer (): the tokenizer to use for summarization
         batch_length (int, optional): the length of each batch. Defaults to 2048.
         batch_stride (int, optional): the stride of each batch. Defaults to 16. The stride is the number of tokens that overlap between batches.
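The docstring above describes overlapping token batches: each batch holds `batch_length` tokens and consecutive batches share `batch_stride` tokens. As a rough sketch of that idea (an assumption about the approach, not the body of `summarize_via_tokenbatches` itself), a fast tokenizer's `stride` and `return_overflowing_tokens` options produce exactly this kind of windowing:

```python
# Illustrative sketch of overlapping token batches, matching the docstring's
# batch_length / batch_stride semantics. Not the actual summarize_via_tokenbatches code.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("pszemraj/led-large-book-summary")

def make_token_batches(input_text: str, batch_length: int = 2048, batch_stride: int = 16):
    """Split input_text into batches of batch_length tokens; consecutive
    batches overlap by batch_stride tokens."""
    encoded = tokenizer(
        input_text,
        max_length=batch_length,
        stride=batch_stride,             # tokens shared between consecutive batches
        truncation=True,
        return_overflowing_tokens=True,  # keep every chunk, not just the first
        add_special_tokens=False,
    )
    return encoded["input_ids"]          # list of token-id lists, one per batch

batches = make_token_batches("A very long document ... " * 1000)
print(len(batches), [len(b) for b in batches])
```

Each batch would then be summarized on its own and the partial summaries joined, which is the workflow the function name suggests under the stated assumptions.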