Error when using AutoTokenizer

#2
by OwenLegalSign - opened

I tried to use AutoTokenizer, but ran into the following error:

docker-llm-1    |   File "/app/app.py", line 15, in load_model
docker-llm-1    |     tokenizer = AutoTokenizer.from_pretrained("INX-TEXT/Bailong-instruct-7B")
docker-llm-1    |   File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 814, in from_pretrained
docker-llm-1    |     return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
docker-llm-1    |   File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 2029, in from_pretrained
docker-llm-1    |     return cls._from_pretrained(
docker-llm-1    |   File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 2261, in _from_pretrained
docker-llm-1    |     tokenizer = cls(*init_inputs, **init_kwargs)
docker-llm-1    |   File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/tokenization_llama_fast.py", line 124, in __init__
docker-llm-1    |     super().__init__(
docker-llm-1    |   File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_fast.py", line 120, in __init__
docker-llm-1    |     raise ValueError(
docker-llm-1    | ValueError: Couldn't instantiate the backend tokenizer from one of: 
docker-llm-1    | (1) a `tokenizers` library serialization file, 
docker-llm-1    | (2) a slow tokenizer instance to convert or 
docker-llm-1    | (3) an equivalent slow tokenizer class to instantiate and convert. 
docker-llm-1    | You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

My transformers version is 4.37.2. How can I resolve the error above? Thanks.

INX-TEXT-AI org
edited Feb 16

@OwenLegalSign

Hi, I just tested after updating to transformers==4.37.2 and did not encounter the error you showed. Could you try installing or upgrading sentencepiece and then retry? The last line of the traceback you provided says: "You need to have sentencepiece installed to convert a slow tokenizer to a fast one." Thank you.
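A minimal sketch of how you might check for the missing dependency before loading the tokenizer, so the failure is obvious up front rather than surfacing as the opaque ValueError inside transformers (the check itself needs only the standard library; the pip command in the message is the usual way to install sentencepiece):

```python
import importlib.util

def sentencepiece_available() -> bool:
    """True when the sentencepiece package is importable in this environment."""
    return importlib.util.find_spec("sentencepiece") is not None

if sentencepiece_available():
    print("sentencepiece found; AutoTokenizer can convert the slow tokenizer")
else:
    print("missing dependency: run `pip install sentencepiece` first")
```

With sentencepiece installed, `AutoTokenizer.from_pretrained("INX-TEXT/Bailong-instruct-7B")` should be able to build the fast tokenizer from the slow one.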

Blaze7451 changed discussion status to closed

Thanks, it works after installing sentencepiece.
