The snippet below wraps the model in the minimal `encode` interface expected by the MTEB benchmark.
```python
# !pip install tensorflow_text
import tensorflow_hub as hub
from tensorflow_text import SentencepieceTokenizer  # registers the SentencePiece ops the model needs
import tensorflow as tf

# Load the multilingual Universal Sentence Encoder from TF Hub
embedder = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3")

class USE:
    def encode(self, sentences, batch_size=32, **kwargs):
        """Embed sentences in batches; returns one embedding per input sentence."""
        embeddings = []
        for i in range(0, len(sentences), batch_size):
            batch_sentences = sentences[i:i + batch_size]
            batch_embeddings = embedder(batch_sentences)
            embeddings.extend(batch_embeddings.numpy())  # convert TF tensors to numpy rows
        return embeddings

model = USE()
```
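With the wrapper in place, the model can be passed directly to an MTEB evaluation run. This is a minimal sketch assuming the `mteb` package's `MTEB` class; the chosen task and output folder are illustrative, not prescribed by this model card.

```python
# !pip install mteb
from mteb import MTEB

# Evaluate the wrapper on one of the classification tasks reported below.
# Task name and output folder are illustrative choices.
evaluation = MTEB(tasks=["AmazonCounterfactualClassification"])
evaluation.run(model, output_folder="results/universal-sentence-encoder")
```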
Evaluation results
| Dataset (MTEB, test set) | Metric | Value (self-reported) |
|---|---|---|
| AmazonCounterfactualClassification (en) | accuracy | 70.672 |
| AmazonCounterfactualClassification (en) | ap | 32.835 |
| AmazonCounterfactualClassification (en) | f1 | 64.427 |
| AmazonPolarityClassification | accuracy | 67.732 |
| AmazonPolarityClassification | ap | 62.475 |
| AmazonPolarityClassification | f1 | 67.486 |
| AmazonReviewsClassification (en) | accuracy | 32.620 |
| AmazonReviewsClassification (en) | f1 | 32.135 |
| ArxivClusteringP2P | v_measure | 35.126 |
| ArxivClusteringS2S | v_measure | 23.457 |