Moritz Laurer

MoritzLaurer

Recent Activity

updated a collection about 23 hours ago
prompt-templates

Organizations

Hugging Face, Amazon SageMaker Community, Zero Shot NLI, Hugging Test Lab, Deutsche Gesellschaft für internationale Zusammenarbeit, HuggingFaceM4, Aledade Inc, classroom-test-room, Prezi, Blog-explorers, Enterprise Explorers, ZeroGPU Explorers, Spectral, C&A, Social Post Explorers, Triple, Dev Mode Explorers, moritz-test-organization-changed-2, Hugging Face Discord Community, Moritz Test Org

MoritzLaurer's activity

upvoted an article 9 days ago
posted an update 14 days ago
Quite excited by the ModernBERT release! Two small sizes (0.15B and 0.4B parameters), pre-trained on 2T tokens of modern data, a modern tokenizer that also covers code, and an 8k context window: a great, efficient model for embeddings & classification!

This will probably be the basis for many future SOTA encoders! And I can finally stop using DeBERTa-v3 from 2021 :D

Congrats @answerdotai, @LightOnIO and collaborators like @tomaarsen!

Paper and models here 👇 https://huggingface.co/collections/answerdotai/modernbert-67627ad707a4acbf33c41deb
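
A rough back-of-the-envelope sketch of why these sizes count as small. It estimates only the memory needed to hold the weights, using the rounded 0.15B and 0.4B parameter counts from the post. The helper name `weight_memory_gb` is made up for illustration, and activations, tokenizer and framework overhead are deliberately ignored, so real usage will be higher.

```python
# Approximate memory for model weights only (no activations or overhead).
# Assumption: rounded parameter counts from the post (0.15B and 0.4B).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Return approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


sizes = {"0.15B": 0.15e9, "0.4B": 0.4e9}
for name, num_params in sizes.items():
    fp16 = weight_memory_gb(num_params, "fp16")
    fp32 = weight_memory_gb(num_params, "fp32")
    print(f"{name}: ~{fp16:.1f} GB in fp16, ~{fp32:.1f} GB in fp32")
```

Under these assumptions, both sizes need well under 2 GB for the weights even in fp32.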
replied to their post 16 days ago

Hey @borowis, I don't think there is a plan to add embedding models to the NIM API. Embedding models are quite small, which makes them easier to run on accessible hardware (vs. the H100 GPUs running the large LLMs on the NIM API). I'd recommend deploying embedding models on a cheap GPU (or even a CPU) via the HF dedicated endpoints: https://huggingface.co/inference-endpoints/dedicated
You can also use the autoscaling/scale-to-zero feature to avoid unnecessary costs.
(The smaller BGE models from the MTEB leaderboard are always a good place to start.)
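
To make the scale-to-zero point concrete, here is a minimal cost sketch. The hourly price and the active hours are placeholder numbers, not taken from any real price list, and the function name `monthly_cost` is made up for illustration. It also assumes billing stops completely while the endpoint is scaled to zero, and it ignores cold-start latency.

```python
# Hypothetical comparison: always-on endpoint vs. one that scales to zero.
# All numbers are placeholders for illustration, not real prices.
def monthly_cost(hourly_price_usd: float, active_hours_per_day: float, days: int = 30) -> float:
    """Cost when you pay only for the hours the endpoint is running."""
    return hourly_price_usd * active_hours_per_day * days


HYPOTHETICAL_PRICE_USD_PER_HOUR = 0.50  # placeholder, not a quoted price

always_on = monthly_cost(HYPOTHETICAL_PRICE_USD_PER_HOUR, active_hours_per_day=24)
scale_to_zero = monthly_cost(HYPOTHETICAL_PRICE_USD_PER_HOUR, active_hours_per_day=4)

print(f"always on:     ${always_on:.2f} per month")
print(f"scale to zero: ${scale_to_zero:.2f} per month")
```

The saving grows with how idle the endpoint is, so this matters most for workloads with bursty or occasional traffic.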