The token embeddings are missing in the safetensors version of the large model

#3
by stefina - opened

Hi,
If I'm not mistaken, the "roberta.embeddings.word_embeddings.weight" tensor is missing from this latest safetensors version of the large model.
Loading it with from_pretrained for a downstream task therefore leaves the embedding matrix randomly initialized... and makes for a very, very hard fine-tuning process!
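For anyone who wants to verify before fine-tuning, here is a minimal sketch that checks the checkpoint's tensor names without loading any weights (the repo id below is a placeholder, substitute the actual large model):

```python
from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Placeholder repo id, replace with the actual large model
path = hf_hub_download("almanach/large-model", "model.safetensors")

# safe_open only reads the file header, so this check is cheap
with safe_open(path, framework="pt") as f:
    print("roberta.embeddings.word_embeddings.weight" in f.keys())
```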
A workaround is to call from_pretrained with revision="df7dbf5", which downloads the previous revision of the model that still includes the token embedding weights.
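Concretely, the workaround looks like this (placeholder repo id again; the revision is the commit hash mentioned above, and the head/num_labels are just an example downstream setup):

```python
from transformers import AutoModelForSequenceClassification

# Pin the earlier revision that still ships the token embeddings.
# Repo id and num_labels are placeholders for your downstream task.
model = AutoModelForSequenceClassification.from_pretrained(
    "almanach/large-model",
    revision="df7dbf5",
    num_labels=2,
)
```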

Yep, I had the same issue: fine-tuning was impossible until I reverted to revision="df7dbf5". Thanks for the tip @stefina!

ALMAnaCH (Inria) org

I just uploaded a new model.safetensors file which fixes the issue. Thanks for the notice.

wissamantoun changed discussion status to closed
