AttributeError: 'BaichuanTokenizer' object has no attribute 'sp_model'
I'm hitting AttributeError: 'BaichuanTokenizer' object has no attribute 'sp_model'. How do I add support for the latest transformers, 4.34?
transformers 4.34 doesn't work for me either. Downgrading to 4.33.1 works in my case.
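For reference, the downgrade is just pinning the older release:

pip install transformers==4.33.1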
Since vllm==0.2.1 requires transformers==4.34.1 for Mistral support, I don't think downgrading is a good idea. Could contributors fix this bug, or suggest a temporary workaround?
Solved with reference to https://github.com/huggingface/transformers/issues/26340#issuecomment-1766794575; this may fix the bug for now.
It didn't work for me
Update tokenization_baichuan.py; see https://github.com/huggingface/transformers/issues/26340
You should file an issue on the model repos and tell them to rearrange the tokenizer init so that self.sp_model is created before calling super().__init__().
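For context, the reason the order matters is that in transformers >= 4.34 the base tokenizer __init__ calls back into subclass methods (via the added-tokens handling), and on a SentencePiece-backed tokenizer those methods read self.sp_model. A minimal sketch of the pattern, with illustrative class names rather than the real transformers code:

# Illustrative sketch only, not the real transformers classes.
class Base:
    def __init__(self):
        # Stands in for transformers >= 4.34, where the base __init__
        # calls back into subclass methods during construction.
        self.initial_vocab = self.get_vocab()

class BrokenTokenizer(Base):
    def __init__(self):
        super().__init__()   # the callback runs here...
        self.sp_model = {}   # ...but sp_model is only assigned afterwards

    def get_vocab(self):
        return dict(self.sp_model)  # reads self.sp_model -> AttributeError

class FixedTokenizer(Base):
    def __init__(self):
        self.sp_model = {}   # create sp_model first
        super().__init__()   # now the callback succeeds

    def get_vocab(self):
        return dict(self.sp_model)

FixedTokenizer()    # works
# BrokenTokenizer() # raises AttributeError: no attribute 'sp_model'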
This solved the problem for me: edit tokenization_baichuan.py, find the super().__init__ call inside __init__, and move it to the end of __init__.
I solved the problem by:
- pip install transformers==4.34.0
- moving super().__init__ to the end, like this:
self.vocab_file = vocab_file
self.add_bos_token = add_bos_token
self.add_eos_token = add_eos_token
# Build the SentencePiece model first, so self.sp_model exists before
# the base class __init__ runs (which touches it in transformers >= 4.34).
self.sp_model = spm.SentencePieceProcessor(**self.sp_model_kwargs)
self.sp_model.Load(vocab_file)
# Call the parent constructor last.
super().__init__(
    bos_token=bos_token,
    eos_token=eos_token,
    unk_token=unk_token,
    pad_token=pad_token,
    add_bos_token=add_bos_token,
    add_eos_token=add_eos_token,
    sp_model_kwargs=self.sp_model_kwargs,
    clean_up_tokenization_spaces=clean_up_tokenization_spaces,
    **kwargs,
)
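To sanity-check the fix, reloading the tokenizer should now succeed. A minimal check (the repo id below is just an example Baichuan checkpoint; substitute your own local path or repo):

from transformers import AutoTokenizer

# Example repo id only; point this at whichever Baichuan checkpoint you use.
tok = AutoTokenizer.from_pretrained("baichuan-inc/Baichuan2-7B-Chat", trust_remote_code=True)
print(tok("hello").input_ids)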