---
library_name: transformers
tags:
- synthetic
- '16384'
license: apache-2.0
datasets:
- BEE-spoke-data/synthsumm-open-v1.0
language:
- en
base_model:
- google/pegasus-x-base
pipeline_tag: summarization
---

# pegasus-x-base-synthsumm_open-16k

This is a text-to-text summarization model fine-tuned from [pegasus-x-base](https://hf.co/google/pegasus-x-base) on a dataset of long documents from various sources/domains and their synthetic summaries.

It performs surprisingly well as a general summarization model for its size. More details, a larger model, and the dataset will be released (_as time permits_).

## Usage

It's recommended to use this model with [beam search decoding](https://huggingface.co/docs/transformers/generation_strategies#beamsearch-decoding). If interested, you can also use the [textsum](https://github.com/pszemraj/textsum) util package to have most of this abstracted out for you:

```bash
pip install -U textsum
```

then:

```python
from textsum.summarize import Summarizer

model_name = "BEE-spoke-data/pegasus-x-base-synthsumm_open-16k"
summarizer = Summarizer(model_name)  # GPU auto-detected

text = "put the text you don't want to read here"
summary = summarizer.summarize_string(text)
print(summary)
```
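
The beam search decoding recommended in the Usage section can also be run with `transformers` directly. The following is a minimal sketch, not a tuned configuration: the values in `BEAM_SEARCH_KWARGS` and the `summarize` helper are illustrative assumptions, and `MAX_INPUT_TOKENS` assumes that the `16384` tag and the `16k` suffix refer to the maximum input length.

```python
MODEL_NAME = "BEE-spoke-data/pegasus-x-base-synthsumm_open-16k"

# Assumption: the "16384" tag / "16k" suffix is the maximum input length in tokens.
MAX_INPUT_TOKENS = 16384

# Illustrative decoding settings (assumptions, not values from this model card).
BEAM_SEARCH_KWARGS = {
    "num_beams": 4,             # num_beams > 1 switches generate() to beam search
    "early_stopping": True,     # stop once num_beams finished candidates exist
    "no_repeat_ngram_size": 3,  # block repeated trigrams in the summary
    "max_new_tokens": 256,      # cap on summary length
}


def summarize(text: str, model_name: str = MODEL_NAME, **overrides) -> str:
    """Summarize `text` with beam search; `overrides` replace the defaults above."""
    # Imported here so the settings above can be inspected without loading transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Truncate inputs that exceed the assumed context length.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    generation_kwargs = {**BEAM_SEARCH_KWARGS, **overrides}
    output_ids = model.generate(**inputs, **generation_kwargs)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(text)` returns the summary as a string; passing, for example, `num_beams=8` overrides the corresponding default for that call. The model is loaded on every call here to keep the sketch short, so for repeated use it would make sense to load the tokenizer and model once and reuse them.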