The token embeddings are missing in the safetensors version of the large model

#3
by stefina - opened

Hi,
If I'm not mistaken, the "roberta.embeddings.word_embeddings.weight" tensor is missing from this latest safetensors version of the large model.
Loading it with from_pretrained for a downstream task therefore leaves the embedding matrix randomly initialized... and makes for a very, very hard fine-tuning process!
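For anyone who wants to verify before fine-tuning, here is a minimal sketch that checks the checkpoint's tensor names without loading any weights (the repo id below is a placeholder, substitute the actual large model):

```python
from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Placeholder repo id, replace with the actual large model
path = hf_hub_download("almanach/large-model", "model.safetensors")

# safe_open only reads the file header, so this check is cheap
with safe_open(path, framework="pt") as f:
    print("roberta.embeddings.word_embeddings.weight" in f.keys())
```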
A workaround is to call from_pretrained with revision="df7dbf5", which downloads the previous revision of the model that still includes the token embedding weights.
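Concretely, the workaround looks like this (placeholder repo id again; the revision is the commit hash mentioned above, and the head/num_labels are just an example downstream setup):

```python
from transformers import AutoModelForSequenceClassification

# Pin the earlier revision that still ships the token embeddings.
# Repo id and num_labels are placeholders for your downstream task.
model = AutoModelForSequenceClassification.from_pretrained(
    "almanach/large-model",
    revision="df7dbf5",
    num_labels=2,
)
```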

Yep, I had the same issue: fine-tuning was impossible until I reverted to revision="df7dbf5". Thanks for the tip @stefina!

ALMAnaCH (Inria) org

I just uploaded a new model.safetensors file which fixes the issue. Thanks for the notice.

wissamantoun changed discussion status to closed
