ValueError: Couldn't instantiate the backend tokenizer

#1
by brrbaral - opened

Hi,
I am unable to use the model. I tried the code from the documentation as-is, running in Google Colab.

Code:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "GOAT-AI/GOAT-7B-Community"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
)

Error:
ValueError: Couldn't instantiate the backend tokenizer from one of:
(1) a tokenizers library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

GOAT.AI org

Hi, you forgot to install the sentencepiece package. See the last line of the traceback: "You need to have sentencepiece installed to convert a slow tokenizer to a fast one." Run `pip install sentencepiece` and restart the runtime, then try again.

bungbae changed discussion status to closed
