Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
tomaarsen 
posted an update 19 days ago
Post
2266
I just published Sentence Transformers v3.0.1: the first patch release since v3 from last week. It introduces gradient checkpointing, pushing model checkpoints to Hugging Face while training, model card improvements and fixes. Details:

1️⃣ Gradient checkpointing allows for much less memory usage at a cost of ~20% training speed. Seems to allow for higher batch sizes, which is quite important for loss functions with in-batch negatives.
2️⃣ You can specify args.push_to_hub=True and args.hub_model_id to upload your model checkpoints to Hugging Face while training. It also uploads your emissions (if codecarbon is installed) and your Tensorboard logs (if tensorboard is installed)
3️⃣ Model card improvements: improved automatic widget examples, better tags, and the default of "sentence_transformers_model_id" now gets replaced when possible.
4️⃣ Several evaluator fixes, see release notes for details.
5️⃣ Fixed a bug with MatryoshkaLoss throwing an error if the supplied Matryoshka dimensions are ascending instead of descending.
6️⃣ Full Safetensors support; even the uncommon modules can now save and load "model.safetensors" files: no more pickle risks.

Check out the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.0.1

And let me know what kind of features you'd like to see next! I have some plans already (ONNX, Sparse models, ColBERT, PEFT), but I don't yet know how I should prioritize everything.

There is an issue while running on my local Mac machine. When changing the model name from "microsoft/DialoGPT-small" to "meta-llama/Meta-Llama-3-8B", the machine is hanging and no response.

GitHub Link

·

Using an AI code assistant has transformed my development process. It streamlines coding tasks, offers insightful suggestions, and accelerates problem-solving. Truly a game-changer!
https://huggingface.co/