
A fast BERT model for computing sentence embeddings in Russian. The model is based on cointegrated/rubert-tiny2 and has the same context size (2048), embedding dimension (312), and inference speed.
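The sizes quoted above can be read from the loaded model itself. A minimal sketch (assuming sentence-transformers is installed; the printed values come from the model's saved configuration, not asserted here):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('sergeyzh/rubert-tiny-turbo')

# Both values are read from the stored sentence-transformers configuration.
print(model.max_seq_length)                      # context size, expected 2048
print(model.get_sentence_embedding_dimension())  # embedding dimension, expected 312
```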

## Usage

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('sergeyzh/rubert-tiny-turbo')

sentences = ["привет мир", "hello world", "здравствуй вселенная"]
embeddings = model.encode(sentences)
print(util.dot_score(embeddings, embeddings))
```
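The example prints raw dot-product scores. If length-normalized similarities are preferred, `util.cos_sim` can be used instead; a minimal variant (the choice of similarity function is not prescribed by the model card):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('sergeyzh/rubert-tiny-turbo')

sentences = ["привет мир", "hello world", "здравствуй вселенная"]
embeddings = model.encode(sentences)

# Cosine similarity divides out vector norms, so scores lie in [-1, 1]
# regardless of whether the embeddings are unit-normalized.
print(util.cos_sim(embeddings, embeddings))
```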

## Metrics

Model scores on the encodechka benchmark:

| model | CPU | GPU | size | Mean S | Mean S+W | dim |
|---|---|---|---|---|---|---|
| BAAI/bge-m3 | 523.40 | 22.50 | 2166 | 0.787 | 0.696 | 1024 |
| intfloat/multilingual-e5-large | 506.80 | 30.80 | 2136 | 0.780 | 0.686 | 1024 |
| intfloat/multilingual-e5-base | 130.61 | 14.39 | 1061 | 0.761 | 0.669 | 768 |
| sergeyzh/rubert-tiny-turbo | 5.51 | 3.25 | 111 | 0.749 | 0.667 | 312 |
| intfloat/multilingual-e5-small | 40.86 | 12.09 | 449 | 0.742 | 0.645 | 384 |
| cointegrated/rubert-tiny2 | 5.51 | 3.25 | 111 | 0.704 | 0.638 | 312 |

| model | STS | PI | NLI | SA | TI | IA | IC | ICX | NE1 | NE2 |
|---|---|---|---|---|---|---|---|---|---|---|
| BAAI/bge-m3 | 0.864 | 0.749 | 0.510 | 0.819 | 0.973 | 0.792 | 0.809 | 0.783 | 0.240 | 0.422 |
| intfloat/multilingual-e5-large | 0.862 | 0.727 | 0.473 | 0.810 | 0.979 | 0.798 | 0.819 | 0.773 | 0.224 | 0.374 |
| intfloat/multilingual-e5-base | 0.835 | 0.704 | 0.459 | 0.796 | 0.964 | 0.783 | 0.802 | 0.738 | 0.235 | 0.376 |
| sergeyzh/rubert-tiny-turbo | 0.828 | 0.722 | 0.476 | 0.787 | 0.955 | 0.757 | 0.780 | 0.685 | 0.305 | 0.373 |
| intfloat/multilingual-e5-small | 0.822 | 0.714 | 0.457 | 0.758 | 0.957 | 0.761 | 0.779 | 0.691 | 0.234 | 0.275 |
| cointegrated/rubert-tiny2 | 0.750 | 0.651 | 0.417 | 0.737 | 0.937 | 0.746 | 0.757 | 0.638 | 0.360 | 0.386 |