AttributeError: 'BaichuanTokenizer' object has no attribute 'sp_model'

#18
by lucasjin - opened

AttributeError: 'BaichuanTokenizer' object has no attribute 'sp_model'.
How can support be added for the latest transformers (4.34)?

transformers 4.34 doesn't work for me either. Downgrading to 4.33.1 works in my case.
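
A quick way to check whether an environment falls in the range reported here is sketched below. It assumes the breakage starts at transformers 4.34, which is only what this thread reports (4.33.1 works, 4.34 does not); the helper names `is_affected` and `installed_transformers_version` are made up for illustration.

```python
from importlib import metadata


def is_affected(version: str) -> bool:
    """Return True for versions at or above 4.34 (cutoff assumed from this thread)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (4, 34)


def installed_transformers_version():
    """Return the installed transformers version string, or None if not installed."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    else:
        print(found, "affected" if is_affected(found) else "not affected")
```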

Since vllm==0.2.1 requires transformers==4.34.1 for Mistral support, I don't think downgrading is a good idea. Could the contributors fix this bug, or suggest a temporary workaround?

Solved with reference to https://github.com/huggingface/transformers/issues/26340#issuecomment-1766794575 . This may fix the bug for now.

It didn't work for me

Update tokenization_baichuan.py, as suggested in https://github.com/huggingface/transformers/issues/26340 :

"You should file an issue on the model repos and tell them to rearrange the tokenizer init so that self.sp_model is created before calling super().__init__()."
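
Why the order matters can be shown with a toy reproduction. This is only a sketch: `BaseTokenizer` and `FakeSpModel` are hypothetical stand-ins, not the real transformers or sentencepiece classes, and it assumes (as the linked issue suggests) that the base constructor calls a method that reads self.sp_model.

```python
class FakeSpModel:
    """Hypothetical stand-in for spm.SentencePieceProcessor."""

    def get_piece_size(self):
        return 3


class BaseTokenizer:
    """Hypothetical stand-in for the library base class."""

    def __init__(self):
        # Assumption: the base constructor queries the vocabulary during setup.
        self.initial_vocab_size = self.vocab_size()

    def vocab_size(self):
        raise NotImplementedError


class BrokenTokenizer(BaseTokenizer):
    def __init__(self):
        super().__init__()             # runs before sp_model exists
        self.sp_model = FakeSpModel()

    def vocab_size(self):
        return self.sp_model.get_piece_size()


class FixedTokenizer(BaseTokenizer):
    def __init__(self):
        self.sp_model = FakeSpModel()  # create sp_model first
        super().__init__()             # base setup can now use it

    def vocab_size(self):
        return self.sp_model.get_piece_size()


def try_build(cls):
    """Return "ok" if the class can be constructed, else the error text."""
    try:
        cls()
        return "ok"
    except AttributeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_build(BrokenTokenizer))
    print(try_build(FixedTokenizer))
```

The broken variant fails with the same kind of AttributeError as in the title, and the fixed variant constructs normally; the only difference is where super().__init__() is called.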

This solved the problem for me: edit tokenization_baichuan.py, find the super().__init__() call in __init__, and move it to the end of __init__.

I solved the problem by:

  1. pip install transformers==4.34.0
  2. Moving the super().__init__() call to the end of __init__, after sp_model is loaded, like this:

    self.vocab_file = vocab_file
    self.add_bos_token = add_bos_token
    self.add_eos_token = add_eos_token
    # Create sp_model first so it exists when the base class __init__ runs
    self.sp_model = spm.SentencePieceProcessor(**self.sp_model_kwargs)
    self.sp_model.Load(vocab_file)
    super().__init__(
        bos_token=bos_token,
        eos_token=eos_token,
        unk_token=unk_token,
        pad_token=pad_token,
        add_bos_token=add_bos_token,
        add_eos_token=add_eos_token,
        sp_model_kwargs=self.sp_model_kwargs,
        clean_up_tokenization_spaces=clean_up_tokenization_spaces,
        **kwargs,
    )
