Has anyone used the script in the Model Card for inference?

#64
by disper84 - opened

I've tried the script from the Model Card, shown in the screenshot below:

[screenshot: inference script from the Model Card]
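For reference, the snippet is roughly the following (a sketch; the exact prompt and device settings in my screenshot may differ):

```python
# Minimal sketch of the transformers snippet from the Gemma model card;
# the prompt text here is an assumption, not the exact screenshot contents.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt")

# With no explicit length cap, generate() falls back to a short default
# max_length, which truncates the output.
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```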

However, the model spits out nonsense words (screenshot below). Any thoughts on that?

[screenshot: garbled model output]

I've also deployed gemma-2b (not gemma-2b-it) in Ollama, and it works perfectly; all responses are solid.

I don't know why this discrepancy happens.

The tokenizer you're using is not fixed correctly. Try checking the PRs in the Community tab here to see if someone has provided a fix.
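For example, a tokenizer fix from a community PR can be loaded directly by its Hub revision (the PR number below is hypothetical; use the actual one from the Community tab):

```python
# Hypothetical: pull the tokenizer from a community PR on the Hub.
# Replace pr_number with the actual PR from the Community tab.
from transformers import AutoTokenizer

pr_number = 1  # hypothetical; not a known fix PR
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b", revision=f"refs/pr/{pr_number}")
```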

[screenshot: warning about generation length]

Hi @disper84, could you please try again and let us know if the issue still persists? Try increasing max_length, e.g. `outputs = model.generate(**input_ids, max_length=200)`, to control the output generation length, as the above warning states.
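A minimal sketch of that change, assuming the model card snippet above:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

input_ids = tokenizer("Write me a poem about Machine Learning.", return_tensors="pt")

# Raise the generation cap above the short default that triggers the warning.
outputs = model.generate(**input_ids, max_length=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```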

Also, pretrained (PT) versions of the model are not trained on any specific tasks or instructions beyond the Gemma core training data, which might cause inconsistency in the output. You should not deploy these models in applications or use them for inference without performing some tuning.
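If you want conversational answers without doing your own fine-tuning, the instruction-tuned variant with its chat template is the intended path. A sketch (the prompt is an assumption):

```python
# Sketch: gemma-2b-it with its chat template, for conversational use
# without custom fine-tuning.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")

chat = [{"role": "user", "content": "Write me a poem about Machine Learning."}]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```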
