The token embeddings are missing in the safetensor version of the large model
#3
by stefina - opened
Hi,
If I'm right, the "roberta.embeddings.word_embeddings.weight" tensor is missing from this latest safetensors version of the large model.
Loading it with from_pretrained for a downstream task then leads to a random initialization of the embedding matrix... and a very, very hard fine-tuning process!
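You can check this yourself by listing the tensors stored in the uploaded file. A minimal sketch (the repo id below is a placeholder, not the actual model name):

```python
# Sketch: inspect which tensors the safetensors checkpoint actually contains.
# "org/large-model" is a placeholder repo id, not the real one.
from huggingface_hub import hf_hub_download
from safetensors import safe_open

path = hf_hub_download("org/large-model", "model.safetensors")
with safe_open(path, framework="pt", device="cpu") as f:
    keys = list(f.keys())

# If the checkpoint is broken, this prints False.
print("roberta.embeddings.word_embeddings.weight" in keys)
```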
A workaround is to call from_pretrained with revision="df7dbf5", which downloads the most recent version of the model that still includes the token embedding weights.
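A minimal sketch of that call, again with a placeholder repo id in place of the actual model name:

```python
# Sketch of the workaround: pin from_pretrained to the earlier revision
# so the checkpoint that still contains the embeddings is downloaded.
# "org/large-model" is a placeholder, not the actual repo id.
from transformers import AutoModel

model = AutoModel.from_pretrained("org/large-model", revision="df7dbf5")
```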
Yep, I had the same issue; fine-tuning was impossible and I had to revert to revision="df7dbf5". Thank you for the tip @stefina!
I just uploaded a new model.safetensors file which fixes the issue. Thanks for the notice.
wissamantoun changed discussion status to closed