Fix various snippets; add required safe_serialization
#2 by tomaarsen (HF staff), opened
Hello!
Pull Request overview
- Fix various snippets: point to nomic-ai/nomic-embed-text-v1.5 rather than "." or nomic-ai/nomic-embed-text-v1.
- Add safe_serialization=True.
Details
The serialization parameter is required because of this line. Without safe_serialization=True, it will only allow loading models with pytorch_model.bin, and your model is uploaded in the newer model.safetensors format.
- Tom Aarsen
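To make the failure mode described above concrete, here is a minimal sketch of that kind of weight-file selection. The function pick_weights_file and its logic are hypothetical and are not the loader code the PR refers to. The sketch only assumes that the flag switches the expected file name from pytorch_model.bin to model.safetensors.

```python
# Hypothetical illustration only: not the real loading code of the model repo.
def pick_weights_file(available_files, safe_serialization=False):
    """Return the weight file a loader of this shape would look for."""
    # Assumption: the flag selects between the two file names named in the PR.
    wanted = "model.safetensors" if safe_serialization else "pytorch_model.bin"
    if wanted not in available_files:
        raise FileNotFoundError(f"{wanted} not found among {sorted(available_files)}")
    return wanted


# A repository that only ships the newer safetensors format.
repo_files = ["config.json", "model.safetensors"]

print(pick_weights_file(repo_files, safe_serialization=True))

try:
    pick_weights_file(repo_files)  # flag left at its default
except FileNotFoundError as err:
    print("load fails without the flag:", err)
```

Under these assumptions, a repository that ships only model.safetensors loads when the flag is set and fails when it is left at its default, which matches the behaviour the PR description reports.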
tomaarsen changed pull request status to open
agh thank you for this!
zpn changed pull request status to merged
Feel free to test this with:
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

# Target size of the truncated (Matryoshka) embeddings
matryoshka_dim = 512

# revision="refs/pr/2" loads the files from this pull request instead of the main branch
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True, revision="refs/pr/2")
sentences = ['search_query: What is TSNE?', 'search_query: Who is Laurens van der Maaten?']
embeddings = model.encode(sentences, convert_to_tensor=True)

# Layer-normalize over the full embedding dimension, keep the first
# matryoshka_dim values, then L2-normalize each truncated vector
embeddings = F.layer_norm(embeddings, normalized_shape=(embeddings.shape[1],))
embeddings = embeddings[:, :matryoshka_dim]
embeddings = F.normalize(embeddings, p=2, dim=1)
print(embeddings)
- Tom Aarsen
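For readers who want to see what the three post-processing steps in the snippet above compute without downloading the model, here is a minimal pure-Python sketch on a toy vector. The helper names (layer_norm, l2_normalize, matryoshka_truncate) and the example values are made up for illustration. The eps defaults mirror the PyTorch defaults for F.layer_norm and F.normalize.

```python
import math


def layer_norm(vec, eps=1e-5):
    # Normalize to zero mean and unit (biased) variance, with no learned scale or shift.
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]


def l2_normalize(vec, eps=1e-12):
    # Scale the vector to unit Euclidean length, guarding against division by zero.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def matryoshka_truncate(vec, dim):
    # Same order as the snippet: layer norm on the full vector, slice, then L2-normalize.
    return l2_normalize(layer_norm(vec)[:dim])


# Toy 6-dimensional "embedding" (made-up values), truncated to 3 dimensions.
example = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5]
short = matryoshka_truncate(example, dim=3)
short_norm = math.sqrt(sum(x * x for x in short))
print(len(short), short_norm)
```

The order matters in this sketch: the layer norm uses statistics of the full vector, and the final L2 normalization is applied only after slicing, so the shortened vector again has unit length.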