generates non-sense response

#1
by vasilee - opened

did you test the result?
for me, with the exact code from the model card (including the prompt),
it answers with:
................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

and for input "how are you?"
it says:
oldsoldssabsarmsarmsarmsarmsarmsarmarmsarmsarmsarmsarmsarmsarmsarmsarmsarmsarmsarmsarmsarmsarmsrachrachsrachrachsrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrachrach rachrachrachrachrachrachrachrachrachrachrachrachrachrachrach rachrachrach

Yes, thanks for raising the issue.

The same problem happened with limcheekin/mpt-7b-storywriter-ct2.
I'm not sure whether something is missing in my conversion process or whether these models simply aren't supported by CTranslate2.

You can try running the conversion yourself and test it out. Let me know if you manage to get it working on your machine.
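For reference, the conversion goes roughly like this (the model name, output directory, and quantization type below are illustrative, not the exact command I ran):

```shell
# Install CTranslate2 and the Hugging Face dependencies it needs for conversion.
pip install ctranslate2 "transformers[torch]" sentencepiece

# Convert the Hugging Face checkpoint to CTranslate2 format with int8 quantization.
ct2-transformers-converter \
    --model lmsys/fastchat-t5-3b-v1.0 \
    --output_dir fastchat-t5-3b-ct2 \
    --quantization int8 \
    --force
```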

If you don't want to do the model conversion yourself, I suggest trying the following repo (a different but similar model) published by another CTranslate2 contributor:
https://huggingface.co/michaelfeil/ct2fast-RedPajama-INCITE-7B-Chat

vasilee changed discussion status to closed
vasilee changed discussion status to open

I tried many small models (<=7B), and WizardLM-7B and fastchat gave me the best summaries and answers, but fastchat wins on a couple of points:

  1. license
    flan-t5 models are truly open source (Apache 2.0), unlike LLaMA models, which are not available for commercial use
  2. context
    flan-t5 models (including fastchat) don't seem to have a problem with contexts longer than specified;
    I used 3K+ tokens (more than the 2048 specified in the config)
  3. speed
    maybe because it is smaller (only 3B), but it is faster than the quantized 4-bit WizardLM-7B, especially with longer contexts

so my question :)
are you going to upload a new working version?

PS: in my tests, flan-alpaca-xl is not on par with fastchat, so it is not an option even though they were trained from the same base model

UPDATE:
flan-alpaca-gpt4-xl works as well as fastchat,
thanks for quantizing it!

You're most welcome regarding the quantized version of flan-alpaca-gpt4-xl. I'm glad it helps.

Thanks for sharing the testing outcomes of the fastchat-t5 model. It seems worth taking another look at the issue.
By the way, did you try running the conversion and quantization on fastchat-t5 yourself to test it out?

Created an issue regarding this matter at https://github.com/OpenNMT/CTranslate2/issues/1295

after the latest changes it seems to work, but it is appending a backtick (`) to every word.

the response for "hi, how are you" is:

I?` I'm` good!` How` about` you?` How` are` you?

the response for translation to German is:

Die` Haus` ist` wunderbar.
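As a stopgap, the stray backticks could also be scrubbed in post-processing. This is just a sketch of my own, not anything from the repo:

```python
def strip_stray_backticks(text: str) -> str:
    # Remove the backtick the broken tokenizer appends to each word,
    # without touching the words themselves.
    return " ".join(word.rstrip("`") for word in text.split())

print(strip_stray_backticks("Die` Haus` ist` wunderbar."))
# Die Haus ist wunderbar.
```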

After following the suggestion from the thread you posted and switching the tokenizer to the one from flan-alpaca-gpt4-xl-ct2, it works fine

Thanks for sharing the solution. I updated the repo to use the tokenizer from flan-alpaca-gpt4-xl-ct2, so no switching is required.

Please verify and close the issue if there's no problem.

Thanks.

The new tokenizer partially works, but it doesn't recognize newlines (\n), which is the expected behavior of the original Flan T5 tokenizer. The fastchat-t5 tokenizer, however, uses a special encoding to represent newlines.

Example:

input_text = "line1\nline2"
Flan T5 tokenizer: ['▁line', '1', '▁line', '2', '</s>']
fastchat-t5-3b tokenizer: ['▁line', '1', '\n', '▁line', '2', '</s>']
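To illustrate the difference, here is a rough pure-Python detokenization sketch (my own simplification, not the real SentencePiece decoder): with the Flan T5 pieces the newline is simply gone, while the fastchat-t5 pieces preserve it.

```python
def detokenize(tokens):
    # Roughly reassemble SentencePiece pieces: "▁" marks a word start,
    # "</s>" ends the sequence, and newline pieces pass through unchanged.
    text = "".join(t for t in tokens if t != "</s>")
    return text.replace("\n▁", "\n").replace("▁", " ").strip()

print(detokenize(['▁line', '1', '▁line', '2', '</s>']))
# line1 line2   <- newline lost with the Flan T5 pieces

print(detokenize(['▁line', '1', '\n', '▁line', '2', '</s>']))
# prints "line1" and "line2" on separate lines
```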

More information on how to use the fastchat tokenizer: https://github.com/OpenNMT/CTranslate2/issues/1220#issuecomment-1679749680
