Has anyone used the script in the Model Card for inference?

#64
by disper84 - opened

I've tried the script from the Model Card, shown in the screenshot below:

[screenshot: inference script from the Model Card]
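For reference, the snippet is roughly the following (a sketch; the exact prompt and device settings in my screenshot may differ):

```python
# Minimal sketch of the transformers snippet from the Gemma model card;
# the prompt text here is an assumption, not the exact screenshot contents.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt")

# With no explicit length cap, generate() falls back to a short default
# max_length, which truncates the output.
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
```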

However, the model spits out nonsense words (screenshot below). Any thoughts on that?

[screenshot: garbled model output]

I've also deployed gemma-2b (not gemma-2b-it) in Ollama, and it works perfectly; all responses are solid.

I don't know why this discrepancy happens.

The tokenizer you're using is not fixed correctly. Try checking the PRs in the Community tab here to see if someone has provided a fix.
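For example, a tokenizer fix from a community PR can be loaded directly by its Hub revision (the PR number below is hypothetical; use the actual one from the Community tab):

```python
# Hypothetical: pull the tokenizer from a community PR on the Hub.
# Replace pr_number with the actual PR from the Community tab.
from transformers import AutoTokenizer

pr_number = 1  # hypothetical; not a known fix PR
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b", revision=f"refs/pr/{pr_number}")
```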

[screenshot: warning about generation length]

Hi @disper84, could you please try again and let us know if the issue still persists? Try increasing max_length, e.g. `outputs = model.generate(**input_ids, max_length=200)`, to control the output generation length, as the above warning states.
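A minimal sketch of that change, assuming the model card snippet above:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

input_ids = tokenizer("Write me a poem about Machine Learning.", return_tensors="pt")

# Raise the generation cap above the short default that triggers the warning.
outputs = model.generate(**input_ids, max_length=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```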

Also, pretrained (PT) versions of the model are not trained on any specific tasks or instructions beyond the Gemma core training data, which might cause inconsistency in the output. You should not deploy these models in applications or use them for inference without performing some tuning.
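If you want conversational answers without doing your own fine-tuning, the instruction-tuned variant with its chat template is the intended path. A sketch (the prompt is an assumption):

```python
# Sketch: gemma-2b-it with its chat template, for conversational use
# without custom fine-tuning.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")

chat = [{"role": "user", "content": "Write me a poem about Machine Learning."}]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```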
