Two Pad Tokens?

#5
by lihaoxin2020 - opened

Hi,

Thank you for your contribution and sharing this!
I noticed there is a new special token, [PAD], indexed 32000, added to the tokenizer specifically for the instruction-finetuned model. It seems the model already has a token indexed 0.
What is the point of adding the new padding token?

Thanks in advance!

lihaoxin2020 changed discussion status to closed

Actually, no. Now I see: `pad_token_id=0` is set in the generation config. The original model itself doesn't have a padding token.
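For anyone landing here: a minimal sketch of the distinction above, using the 🤗 Transformers `GenerationConfig` class. This is an illustration with made-up token IDs, not code taken from this repo — the point is just that `pad_token_id` can live in the generation config even when the tokenizer defines no `[PAD]` token.

```python
from transformers import GenerationConfig

# A GenerationConfig can carry a pad_token_id on its own, independent of
# whether the tokenizer has a [PAD] token. IDs here are illustrative.
gen_config = GenerationConfig(pad_token_id=0, bos_token_id=1, eos_token_id=2)
print(gen_config.pad_token_id)  # -> 0
```

By contrast, adding a `[PAD]` token to the tokenizer (e.g. via `tokenizer.add_special_tokens({"pad_token": "[PAD]"})` followed by `model.resize_token_embeddings(len(tokenizer))`) actually grows the vocabulary, which is presumably why the new token shows up at index 32000 for the instruction-finetuned model.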
