The snippet below wraps the model in the `encode` interface expected by the MTEB benchmark.
```python
# !pip install tensorflow tensorflow_hub tensorflow_text
import tensorflow_hub as hub
from tensorflow_text import SentencepieceTokenizer  # noqa: F401 -- registers the SentencePiece ops the model needs
import tensorflow as tf

# Load the multilingual Universal Sentence Encoder (large, v3) from TF Hub.
embedder = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3")


class USE:
    def encode(self, sentences, batch_size=32, **kwargs):
        """Embed sentences in batches; returns a list of 512-dim NumPy vectors."""
        embeddings = []
        for i in range(0, len(sentences), batch_size):
            batch_sentences = sentences[i:i + batch_size]
            batch_embeddings = embedder(batch_sentences)
            # Convert the TF tensor rows to NumPy arrays before collecting them.
            embeddings.extend(batch_embeddings.numpy())
        return embeddings


model = USE()
```
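For context, here is a minimal sketch of how a wrapper like this is typically passed to the MTEB runner. It assumes the `mteb` package is installed (`pip install mteb`); the task name and output folder are chosen purely for illustration and do not come from the original card.

```python
# Minimal sketch, assuming the `mteb` package is available.
# "Banking77Classification" is an illustrative task choice, not one
# of the tasks reported below.
from mteb import MTEB

evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model, output_folder="results/use-multilingual-large-3")
print(results)
```

Any object exposing an `encode(sentences, batch_size=..., **kwargs)` method that returns one vector per sentence can be evaluated this way, which is why the `USE` wrapper above is so thin.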
## Evaluation results
All scores are self-reported MTEB results on the test split.

| Task | Metric | Score |
|---|---|---|
| AmazonCounterfactualClassification (en) | accuracy | 70.806 |
| AmazonCounterfactualClassification (en) | ap | 32.820 |
| AmazonCounterfactualClassification (en) | f1 | 64.532 |
| AmazonPolarityClassification | accuracy | 67.045 |
| AmazonPolarityClassification | ap | 61.734 |
| AmazonPolarityClassification | f1 | 66.662 |
| AmazonReviewsClassification (en) | accuracy | 35.850 |
| AmazonReviewsClassification (en) | f1 | 35.332 |
| ArxivClusteringP2P | v_measure | 34.745 |
| ArxivClusteringS2S | v_measure | 22.621 |