Missing `pad_token_id` in config
I am trying to use this model with text-embeddings-inference. It fails to load the model with this error:
Error: Failed to parse `config.json`
Caused by:
missing field `pad_token_id` at line 54 column 1
While I have your attention: I want an open-weights embedding model with a max sequence length >= 1024 and a size < 1 GB. This is the first model I found that meets those criteria. I would be using the model for RAG semantic search. Is there any reason why I might want to find another model? Performance perhaps?
Yeah, it doesn't work with text-embeddings-inference because Hugging Face needs to update to the master branch of sentence-transformers, I think.
Hmm, I often find that truncating to e.g. 1024 or 512 is just as good as using the full embedding.
Do you know if adding `pad_token_id` might be sufficient as a workaround to make the model usable with text-embeddings-inference now?
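To make the question concrete, the workaround I have in mind is roughly the sketch below: patch a local copy of `config.json` so the field the parser complains about is present. The helper name `add_pad_token_id` is hypothetical, and I have not confirmed that this alone is enough for text-embeddings-inference to load the model. The right value is also an assumption; it would presumably need to match the tokenizer's actual pad token.

```python
import json
from pathlib import Path


def add_pad_token_id(config_path, pad_token_id):
    """Add `pad_token_id` to a local config.json if it is missing.

    Hypothetical helper for illustration only.
    Returns True if the file was modified, False if the field already existed.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Leave an existing value alone rather than overwrite it.
    if "pad_token_id" in config:
        return False

    # Assumption: the caller supplies an id that matches the tokenizer's pad token.
    config["pad_token_id"] = pad_token_id
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

I would call it as `add_pad_token_id("path/to/model/config.json", 0)`, where both the path and the `0` are placeholders, not values taken from this model.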
idk sorry