🚩 Report
The model appears to be bugged and is not responding correctly. Here's an example:
Input:
"LongT5 model is an encoder-decoder transformer pre-trained in a text-to-text denoising generative setting (Pegasus-like generation pre-training). LongT5 model is an extension of T5 model, and it enables using one of the two different efficient attention mechanisms - (1) Local attention, or (2) Transient-Global attention. The usage of attention sparsity patterns allows the model to efficiently handle input sequence.
LongT5 is particularly effective when fine-tuned for text generation (summarization, question answering) which requires handling long input sequences (up to 16,384 tokens)."
Output:
"matematic matematic orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid orchid"
The same happens with other examples.
I am also having a similar issue. On both fine-tuned and base versions of this model, I get output on a simple summarization task like the following:
" informal informal Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt Kontakt"
" the a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a"
Which precision are you running the model at? I have noticed some issues when trying to load it in 16-bit, so that might be the cause. Try running it with torch_dtype=torch.float32, or just omit the dtype argument when loading the model.
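For reference, here is a minimal sketch of loading the model in full precision to rule out fp16-related degradation. The checkpoint name `google/long-t5-tglobal-base` is an assumption; substitute whichever LongT5 checkpoint you are actually using:

```python
import torch
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

# Checkpoint name is an assumption; replace with the model you are testing.
checkpoint = "google/long-t5-tglobal-base"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Explicitly request float32 weights instead of half precision.
model = LongT5ForConditionalGeneration.from_pretrained(
    checkpoint, torch_dtype=torch.float32
)

# Sanity-check that the loaded weights really are float32.
print(next(model.parameters()).dtype)
```

If the dtype printed here is not `torch.float32`, the degenerate repeated-token output could plausibly come from a half-precision overflow rather than a bug in the model itself.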
I am also facing a similar issue. I did not specify the precision setting (torch_dtype) explicitly when loading the model, so it should use its default precision, which is typically float32. But the issue still persists.
Is the model really bugged, or were you able to overcome this issue?
Final Output:
GuGuGuGuGuGuGuGuGuGuGuGuchiechiechiechiechiechiechiechiechiechiechiechiechiechiechie reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot reboot