pszemraj committed
Commit 916f36c • 1 Parent(s): 9bc2923

Signed-off-by: peter szemraj <peterszemraj@gmail.com>

Files changed (2)
  1. app.py +2 -2
  2. summarize.py +1 -1
app.py CHANGED
@@ -267,8 +267,8 @@ if __name__ == "__main__":
     "- [This model](https://huggingface.co/pszemraj/led-large-book-summary) is a fine-tuned checkpoint of [allenai/led-large-16384](https://huggingface.co/allenai/led-large-16384) on the [BookSum dataset](https://arxiv.org/abs/2105.08209).The goal was to create a model that can generalize well and is useful in summarizing lots of text in academic and daily usage."
     )
     gr.Markdown(
-    "- The two most important parameters-empirically-are the `num_beams` and `token_batch_length`. However, increasing these will also increase the amount of time it takes to generate a summary. The `length_penalty` and `repetition_penalty` parameters are also important for the model to generate good summaries."
-    )
+    "- The two most important parameters-empirically-are the `num_beams` and `token_batch_length`. "
+    )
     gr.Markdown(
     "- The model can be used with tag [pszemraj/led-large-book-summary](https://huggingface.co/pszemraj/led-large-book-summary). See the model card for details on usage & a notebook for a tutorial."
     )
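For reference, the `num_beams`, `length_penalty`, and `repetition_penalty` knobs mentioned in these Markdown strings are standard Hugging Face generation arguments, while `token_batch_length` is an app-level setting that controls how much text is fed to the model at once. Below is a minimal, hypothetical sketch (not the app's actual code) of how such parameters could be passed to the linked checkpoint; the specific values are illustrative assumptions.

```python
# Hypothetical sketch: passing the UI parameters through to generate()
# for the pszemraj/led-large-book-summary checkpoint. Values are illustrative.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "pszemraj/led-large-book-summary"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Replace with a long document to summarize..."
# token_batch_length is approximated here by the tokenizer's max_length (assumption).
inputs = tokenizer(text, truncation=True, max_length=2048, return_tensors="pt")

summary_ids = model.generate(
    **inputs,
    num_beams=4,             # more beams improve the search but slow generation down
    length_penalty=0.8,      # <1.0 nudges beam search toward shorter summaries
    repetition_penalty=3.5,  # discourages repeated phrases in the output
    max_new_tokens=512,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```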
summarize.py CHANGED
@@ -83,7 +83,7 @@ def summarize_via_tokenbatches(

     Args:
         input_text (str): the text to summarize
-        model (): the model to use for summarization
+        model (): the model to use for summarizationz
         tokenizer (): the tokenizer to use for summarization
         batch_length (int, optional): the length of each batch. Defaults to 2048.
         batch_stride (int, optional): the stride of each batch. Defaults to 16. The stride is the number of tokens that overlap between batches.
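The docstring above describes overlapping token batches: each batch holds `batch_length` tokens and consecutive batches share `batch_stride` tokens. As a rough sketch of that idea (an assumption about the approach, not the body of `summarize_via_tokenbatches` itself), a fast tokenizer's `stride` and `return_overflowing_tokens` options produce exactly this kind of windowing:

```python
# Illustrative sketch of overlapping token batches, matching the docstring's
# batch_length / batch_stride semantics. Not the actual summarize_via_tokenbatches code.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("pszemraj/led-large-book-summary")

def make_token_batches(input_text: str, batch_length: int = 2048, batch_stride: int = 16):
    """Split input_text into batches of batch_length tokens; consecutive
    batches overlap by batch_stride tokens."""
    encoded = tokenizer(
        input_text,
        max_length=batch_length,
        stride=batch_stride,             # tokens shared between consecutive batches
        truncation=True,
        return_overflowing_tokens=True,  # keep every chunk, not just the first
        add_special_tokens=False,
    )
    return encoded["input_ids"]          # list of token-id lists, one per batch

batches = make_token_batches("A very long document ... " * 1000)
print(len(batches), [len(b) for b in batches])
```

Each batch would then be summarized on its own and the partial summaries joined, which is the workflow the function name suggests under the stated assumptions.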