Nomic AI

Enterprise

company

Verified

https://nomic.ai

nomic_ai

nomic-ai

Activity Feed

AI & ML interests

embeddings, graph statistics, nlp

Recent Activity

zpn new activity 1 day ago

nomic-ai/modernbert-embed-base:Reproducible training script somewhere?

zpn updated a model 4 days ago

nomic-ai/modernbert-embed-base

zpn new activity 4 days ago

nomic-ai/modernbert-embed-base:Upload ONNX weights

nomic-ai's activity

zpn

in nomic-ai/modernbert-embed-base 1 day ago

Reproducible training script somewhere?

#4 opened 1 day ago by

Jesse-marqo

tomaarsen

posted an update 3 days ago

Post

2282

That didn't take long! Nomic AI has finetuned the new ModernBERT-base encoder model into a strong embedding model for search, classification, clustering and more!

Details:
🤖 Based on ModernBERT-base with 149M parameters.
📊 Outperforms both nomic-embed-text-v1 and nomic-embed-text-v1.5 on MTEB!
🏎️ Immediate FA2 and unpadding support for super efficient inference.
🪆 Trained with Matryoshka support, i.e. 2 valid output dimensionalities: 768 and 256.
➡️ Maximum sequence length of 8192 tokens!
2️⃣ Trained in 2 stages: unsupervised contrastive data -> high quality labeled datasets.
➕ Integrated in Sentence Transformers, Transformers, LangChain, LlamaIndex, Haystack, etc.
🏛️ Apache 2.0 licensed: fully commercially permissible

Try it out here: nomic-ai/modernbert-embed-base

Very nice work by Zach Nussbaum and colleagues at Nomic AI.
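The Matryoshka point above means the first 256 values of a 768-dimensional embedding are meant to be usable as a smaller embedding on their own. A minimal sketch of that truncation step, assuming plain truncate-then-renormalize is all that is needed; the function names and the toy 8-dimensional vectors are illustrative and do not come from the model card:

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values of an embedding and rescale to unit length."""
    if dim <= 0 or dim > len(embedding):
        raise ValueError("dim must be between 1 and len(embedding)")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 8-dimensional vectors standing in for 768-dimensional model outputs
# (hypothetical values, chosen only for illustration).
query = [0.9, 0.1, 0.3, 0.2, 0.05, 0.02, 0.01, 0.03]
doc = [0.8, 0.2, 0.25, 0.1, 0.01, 0.04, 0.02, 0.01]

# Truncate both to a smaller "valid" dimensionality, here 4 instead of 256.
query_small = truncate_and_normalize(query, 4)
doc_small = truncate_and_normalize(doc, 4)

print("full-size similarity:", cosine_similarity(query, doc))
print("truncated similarity:", cosine_similarity(query_small, doc_small))
```

In Sentence Transformers the same effect is normally obtained by passing `truncate_dim=256` when constructing `SentenceTransformer`, so the manual step is mainly useful for embeddings that are already stored at full size.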

zpn

updated a model 4 days ago

nomic-ai/modernbert-embed-base

zpn

in nomic-ai/modernbert-embed-base 4 days ago

Upload ONNX weights

#3 opened 4 days ago by

Xenova

Fix typo; update README script + specific MRL snippets

#2 opened 4 days ago by

tomaarsen

in nomic-ai/modernbert-embed-base 4 days ago

Fix typo; update README script + specific MRL snippets

#2 opened 4 days ago by

tomaarsen

updated a model 4 days ago

nomic-ai/modernbert-embed-base

zpn

in nomic-ai/modernbert-embed-base 4 days ago

Add new SentenceTransformer model

#1 opened 4 days ago by

zpn

updated a model 4 days ago

nomic-ai/modernbert-embed-base-unsupervised

tomaarsen

authored a paper 15 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 16 days ago • 116

jxm

in nomic-ai/nomic-bert-2048 16 days ago

the nomic embedding model fails with error `configuration_hf_nomic_bert' has no attribute 'NomicBertConfig'`

#19 opened 16 days ago by

lbwavebo-uber

jxm

updated a model 16 days ago

nomic-ai/nomic-bert-2048

Fill-Mask • Updated 16 days ago • 24.1k • 30

zpn

updated a model about 1 month ago

nomic-ai/nomic-bert-2048

Fill-Mask • Updated 16 days ago • 24.1k • 30

zpn

in nomic-ai/nomic-bert-2048 about 2 months ago

Update modeling_hf_nomic_bert.py

#17 opened about 2 months ago by

zpn

Update modeling_hf_nomic_bert.py

#14 opened about 2 months ago by

zpn

Update modeling_hf_nomic_bert.py

#15 opened about 2 months ago by

zpn

Update modeling_hf_nomic_bert.py

#16 opened about 2 months ago by

zpn

Update modeling_hf_nomic_bert.py

#13 opened about 2 months ago by

zpn

Update modeling_hf_nomic_bert.py

#12 opened about 2 months ago by

zpn

tomaarsen

posted an update about 2 months ago

Post

5425

I just released Sentence Transformers v3.3.0 & it's huge! 4.5x speedup for CPU with OpenVINO int8 static quantization, training with prompts for a free perf. boost, PEFT integration, evaluation on NanoBEIR, and more! Details:

1. We integrate Post-Training Static Quantization using OpenVINO, a very efficient solution for CPUs that processes 4.78x as many texts per second on average, while only hurting performance by 0.36% on average. There's a new export_static_quantized_openvino_model method to quantize a model.

2. We add the option to train with prompts, e.g. strings like "query: ", "search_document: " or "Represent this sentence for searching relevant passages: ". It's as simple as using the prompts argument in SentenceTransformerTrainingArguments. Our experiments show that you can easily reach 0.66% to 0.90% relative performance improvement on NDCG@10 at no extra cost by adding "query: " before each training query and "document: " before each training answer.

3. Sentence Transformers now supports training PEFT adapters via 7 new methods for adding new adapters or loading pre-trained ones. You can also directly load a trained adapter with SentenceTransformer as if it's a normal model. Very useful for e.g. 1) training multiple adapters on 1 base model, 2) training bigger models than otherwise possible, or 3) cheaply hosting multiple models by switching multiple adapters on 1 base model.

4. We added easy evaluation on NanoBEIR, a subset of BEIR a.k.a. the MTEB Retrieval benchmark. It contains 13 datasets with 50 queries and up to 10k documents each. Evaluation is fast, and can easily be done during training to track your model's performance on general-purpose information retrieval tasks.
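Item 2 describes prompts as fixed strings placed in front of each training text. A rough sketch of what that amounts to, written in plain Python so it runs without the library; `apply_prompts`, the column names and the example texts are hypothetical, and the real trainer handles this internally through the `prompts` argument:

```python
def apply_prompts(dataset, prompts):
    """Return a copy of `dataset` with each column's prompt prepended to its texts.

    `dataset` maps column names to lists of strings; `prompts` maps column
    names to the prefix for that column. Columns without a prompt are copied
    unchanged.
    """
    return {
        column: [prompts.get(column, "") + text for text in texts]
        for column, texts in dataset.items()
    }


# Hypothetical two-column training data (query / answer pairs).
train_data = {
    "query": ["what is matryoshka representation learning?"],
    "answer": ["A training scheme that makes embedding prefixes usable on their own."],
}

# The prefixes quoted in the post for the NDCG@10 experiment.
prompts = {"query": "query: ", "answer": "document: "}

prompted = apply_prompts(train_data, prompts)
print(prompted["query"][0])
print(prompted["answer"][0])
```

The same prefixes then have to be used at inference time, otherwise the model sees inputs that differ from what it was trained on.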

Additionally, we also deprecate Python 3.8, add better compatibility with Transformers v4.46.0, and more. Read the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.3.0
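The prompt experiment and the NanoBEIR evaluation in the post both report NDCG@10. For readers unfamiliar with the metric, here is a small sketch for a single query, assuming the linear-gain variant (some implementations use an exponential gain instead); the function names and document ids are illustrative and not taken from Sentence Transformers:

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance scores."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_doc_ids, relevance_by_doc, k=10):
    """NDCG@k for one query.

    `ranked_doc_ids` is the retrieval result, best first.
    `relevance_by_doc` maps each relevant document id to its relevance grade;
    documents missing from the mapping count as relevance 0.
    """
    gains = [relevance_by_doc.get(doc_id, 0) for doc_id in ranked_doc_ids]
    ideal = sorted(relevance_by_doc.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(gains, k) / ideal_dcg


# Hypothetical query with one relevant document, retrieved at rank 2.
score = ndcg_at_k(["d3", "d1", "d2"], {"d1": 1})
print(score)
```

A benchmark score is the mean of this value over all queries in a dataset, then averaged across datasets.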

Team members 18
