
SentenceTransformer based on intfloat/e5-base-unsupervised

This is a sentence-transformers model finetuned from intfloat/e5-base-unsupervised. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/e5-base-unsupervised
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
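
These values can be checked programmatically after loading the model; a minimal sketch using the repository id shown in the Usage section below:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bobox/E5-base-unsupervised-TSDAE-2")
print(model.get_sentence_embedding_dimension())  # expected: 768
print(model.max_seq_length)                      # expected: 512
print(model.similarity_fn_name)                  # expected: "cosine"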

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
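
The Pooling module above averages the token embeddings (pooling_mode_mean_tokens: True), ignoring padding positions via the attention mask. A minimal PyTorch sketch of that operation (illustrative only, not the library's internal code):

import torch

def mean_pooling(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: (batch, seq_len, 768); attention_mask: (batch, seq_len), 1 for real tokens
    mask = attention_mask.unsqueeze(-1).float()
    summed = (token_embeddings * mask).sum(dim=1)   # sum of non-padding token vectors
    counts = mask.sum(dim=1).clamp(min=1e-9)        # number of non-padding tokens per sentence
    return summed / counts                          # (batch, 768) sentence embeddings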

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bobox/E5-base-unsupervised-TSDAE-2")
# Run inference
sentences = [
    'ligand ion channels located?',
    'where are ligand gated ion channels located?',
    "Duvets tend to be warm but surprisingly lightweight. The duvet cover makes it easier to change bedding looks and styles. You won't need to wash your duvet very often, just wash the cover regularly. Additionally, duvets tend to be fluffier than comforters, and can simplify bed making if you choose the European style.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
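
The same embeddings also support semantic search over a corpus; a short sketch with sentence_transformers.util (the corpus below reuses the example sentences and is purely illustrative):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("bobox/E5-base-unsupervised-TSDAE-2")

corpus = [
    "where are ligand gated ion channels located?",
    "Duvets tend to be warm but surprisingly lightweight.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode("ligand ion channels located?", convert_to_tensor=True)

# Retrieve the top-k most similar corpus entries for the query
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], round(hit["score"], 3))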

Evaluation

Metrics

Semantic Similarity (sts-test)

Metric Value
pearson_cosine 0.7652
spearman_cosine 0.7525
pearson_manhattan 0.7393
spearman_manhattan 0.7326
pearson_euclidean 0.7402
spearman_euclidean 0.7335
pearson_dot 0.5003
spearman_dot 0.4986
pearson_max 0.7652
spearman_max 0.7525
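
These figures are the Pearson and Spearman correlations between the model's similarity scores and gold similarity labels on an STS-style test set (the 0.7525 spearman_cosine value matches the final sts-test_spearman_cosine entry in the training logs below). A hedged sketch of how such numbers are computed, using made-up pairs and gold scores:

import numpy as np
from scipy.stats import pearsonr, spearmanr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bobox/E5-base-unsupervised-TSDAE-2")

# Hypothetical STS-style pairs with gold similarity scores (illustration only)
pairs = [("a man is playing guitar", "a person plays a guitar"),
         ("a man is playing guitar", "a chef is cooking pasta"),
         ("two dogs run on the beach", "dogs are running outside")]
gold = [0.9, 0.1, 0.8]

emb1 = model.encode([a for a, _ in pairs], normalize_embeddings=True)
emb2 = model.encode([b for _, b in pairs], normalize_embeddings=True)
cosine_scores = np.sum(emb1 * emb2, axis=1)  # row-wise cosine of unit-normalized embeddings

print("pearson_cosine: ", pearsonr(cosine_scores, gold)[0])
print("spearman_cosine:", spearmanr(cosine_scores, gold)[0])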

Training Details

Training Dataset

Unnamed Dataset

  • Size: 700,000 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0: string; min: 3 tokens, mean: 15.73 tokens, max: 55 tokens
    • sentence_1: string; min: 8 tokens, mean: 36.05 tokens, max: 131 tokens
  • Samples (sentence_0 is a noise-corrupted version of sentence_1):
    • sentence_0: Quality such a has components with applicable high objective system measure component improvements
      sentence_1: Quality in such a system has three components: high accuracy, compliance with applicable standards, and high customer satisfaction. The objective of the system is to measure each component and achieve improvements.
    • sentence_0: include
      sentence_1: does qbi include capital gains?
    • sentence_0: They have a . parietal is in, as becomes and pigments after four to is believed and in circadian cycles
      sentence_1: They have a third eye. The parietal eye is only visible in hatchlings, as it becomes covered in scales and pigments after four to six months. Its function is a subject of ongoing research, but it is believed to be useful in absorbing ultraviolet rays and in setting circadian and seasonal cycles.
  • Loss: DenoisingAutoEncoderLoss
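
The sentence_0 / sentence_1 columns above are exactly the (noisy input, original text) pairs that DenoisingAutoEncoderLoss consumes. A hedged sketch of a typical TSDAE setup with Sentence Transformers (the corpus and training-loop details here are illustrative assumptions, not taken from this card):

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer
from sentence_transformers.datasets import DenoisingAutoEncoderDataset
from sentence_transformers.losses import DenoisingAutoEncoderLoss

model = SentenceTransformer("intfloat/e5-base-unsupervised")

# Plain sentences; the dataset wrapper creates (noisy, original) pairs by deleting tokens
sentences = ["where are ligand gated ion channels located?",
             "does qbi include capital gains?"]  # illustrative corpus
train_dataloader = DataLoader(DenoisingAutoEncoderDataset(sentences), batch_size=16, shuffle=True)

# Ties an encoder-decoder pair initialized from the same checkpoint, as in the TSDAE recipe
train_loss = DenoisingAutoEncoderLoss(
    model, decoder_name_or_path="intfloat/e5-base-unsupervised", tie_encoder_decoder=True
)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=2)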

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 2
  • multi_dataset_batch_sampler: round_robin
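
Under the Sentence Transformers v3 trainer, the non-default values above map onto a SentenceTransformerTrainingArguments object roughly as follows (the output directory is a placeholder):

from sentence_transformers.training_args import (
    SentenceTransformerTrainingArguments,
    MultiDatasetBatchSamplers,
)

args = SentenceTransformerTrainingArguments(
    output_dir="output/e5-base-tsdae",  # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=2,
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)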

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Click to expand
Epoch Step Training Loss sts-test_spearman_cosine
0 0 - 0.7211
0.0114 500 9.4957 -
0.0229 1000 7.4063 -
0.0343 1500 7.0225 -
0.0457 2000 6.6991 -
0.0571 2500 6.4054 -
0.0686 3000 6.1933 -
0.08 3500 5.999 -
0.0914 4000 5.8471 -
0.1 4375 - 0.4610
0.1029 4500 5.6876 -
0.1143 5000 5.5934 -
0.1257 5500 5.4877 -
0.1371 6000 5.4034 -
0.1486 6500 5.3016 -
0.16 7000 5.2169 -
0.1714 7500 5.1351 -
0.1829 8000 5.0605 -
0.1943 8500 4.9851 -
0.2 8750 - 0.6490
0.2057 9000 4.9024 -
0.2171 9500 4.8722 -
0.2286 10000 4.7955 -
0.24 10500 4.7435 -
0.2514 11000 4.6742 -
0.2629 11500 4.6447 -
0.2743 12000 4.5964 -
0.2857 12500 4.5186 -
0.2971 13000 4.5024 -
0.3 13125 - 0.7121
0.3086 13500 4.4336 -
0.32 14000 4.3767 -
0.3314 14500 4.3454 -
0.3429 15000 4.3067 -
0.3543 15500 4.2627 -
0.3657 16000 4.2323 -
0.3771 16500 4.208 -
0.3886 17000 4.1622 -
0.4 17500 4.113 0.7375
0.4114 18000 4.1097 -
0.4229 18500 4.0666 -
0.4343 19000 4.0311 -
0.4457 19500 4.0241 -
0.4571 20000 3.9991 -
0.4686 20500 3.9873 -
0.48 21000 3.9439 -
0.4914 21500 3.9281 -
0.5 21875 - 0.7502
0.5029 22000 3.9047 -
0.5143 22500 3.89 -
0.5257 23000 3.8671 -
0.5371 23500 3.85 -
0.5486 24000 3.8336 -
0.56 24500 3.8081 -
0.5714 25000 3.8049 -
0.5829 25500 3.7587 -
0.5943 26000 3.769 -
0.6 26250 - 0.7530
0.6057 26500 3.7488 -
0.6171 27000 3.7218 -
0.6286 27500 3.7128 -
0.64 28000 3.7104 -
0.6514 28500 3.6706 -
0.6629 29000 3.6602 -
0.6743 29500 3.658 -
0.6857 30000 3.665 -
0.6971 30500 3.6439 -
0.7 30625 - 0.7561
0.7086 31000 3.6411 -
0.72 31500 3.6141 -
0.7314 32000 3.6172 -
0.7429 32500 3.5975 -
0.7543 33000 3.5827 -
0.7657 33500 3.5836 -
0.7771 34000 3.5484 -
0.7886 34500 3.5275 -
0.8 35000 3.5587 0.7553
0.8114 35500 3.5371 -
0.8229 36000 3.5334 -
0.8343 36500 3.5168 -
0.8457 37000 3.483 -
0.8571 37500 3.4755 -
0.8686 38000 3.4943 -
0.88 38500 3.4699 -
0.8914 39000 3.4732 -
0.9 39375 - 0.7560
0.9029 39500 3.4572 -
0.9143 40000 3.4518 -
0.9257 40500 3.4298 -
0.9371 41000 3.4215 -
0.9486 41500 3.4176 -
0.96 42000 3.4353 -
0.9714 42500 3.4137 -
0.9829 43000 3.4037 -
0.9943 43500 3.4157 -
1.0 43750 - 0.7554
1.0057 44000 3.393 -
1.0171 44500 3.4092 -
1.0286 45000 3.3861 -
1.04 45500 3.3976 -
1.0514 46000 3.3769 -
1.0629 46500 3.3444 -
1.0743 47000 3.3598 -
1.0857 47500 3.3556 -
1.0971 48000 3.3548 -
1.1 48125 - 0.7549
1.1086 48500 3.3278 -
1.12 49000 3.3309 -
1.1314 49500 3.3459 -
1.1429 50000 3.3353 -
1.1543 50500 3.3192 -
1.1657 51000 3.3022 -
1.1771 51500 3.3189 -
1.1886 52000 3.301 -
1.2 52500 3.2785 0.7542
1.2114 53000 3.2996 -
1.2229 53500 3.2863 -
1.2343 54000 3.2916 -
1.2457 54500 3.272 -
1.2571 55000 3.2896 -
1.2686 55500 3.2694 -
1.28 56000 3.2848 -
1.2914 56500 3.2528 -
1.3 56875 - 0.7554
1.3029 57000 3.2622 -
1.3143 57500 3.2515 -
1.3257 58000 3.2385 -
1.3371 58500 3.2341 -
1.3486 59000 3.2275 -
1.3600 59500 3.2538 -
1.3714 60000 3.2329 -
1.3829 60500 3.2322 -
1.3943 61000 3.2039 -
1.4 61250 - 0.7530
1.4057 61500 3.212 -
1.4171 62000 3.2127 -
1.4286 62500 3.1956 -
1.44 63000 3.202 -
1.4514 63500 3.2046 -
1.4629 64000 3.2105 -
1.4743 64500 3.1915 -
1.4857 65000 3.176 -
1.4971 65500 3.1852 -
1.5 65625 - 0.7541
1.5086 66000 3.1988 -
1.52 66500 3.1714 -
1.5314 67000 3.1816 -
1.5429 67500 3.1745 -
1.5543 68000 3.1674 -
1.5657 68500 3.1887 -
1.5771 69000 3.1567 -
1.5886 69500 3.1775 -
1.6 70000 3.1696 0.7535
1.6114 70500 3.154 -
1.6229 71000 3.1553 -
1.6343 71500 3.1675 -
1.6457 72000 3.1516 -
1.6571 72500 3.1569 -
1.6686 73000 3.1403 -
1.6800 73500 3.1667 -
1.6914 74000 3.1545 -
1.7 74375 - 0.7529
1.7029 74500 3.1736 -
1.7143 75000 3.1447 -
1.7257 75500 3.1567 -
1.7371 76000 3.1682 -
1.7486 76500 3.149 -
1.76 77000 3.1522 -
1.7714 77500 3.1412 -
1.7829 78000 3.1268 -
1.7943 78500 3.1476 -
1.8 78750 - 0.7524
1.8057 79000 3.1669 -
1.8171 79500 3.1432 -
1.8286 80000 3.1603 -
1.8400 80500 3.1347 -
1.8514 81000 3.1209 -
1.8629 81500 3.1302 -
1.8743 82000 3.1423 -
1.8857 82500 3.1481 -
1.8971 83000 3.1262 -
1.9 83125 - 0.7525
1.9086 83500 3.1484 -
1.92 84000 3.1331 -
1.9314 84500 3.122 -
1.9429 85000 3.1272 -
1.9543 85500 3.1435 -
1.9657 86000 3.1431 -
1.9771 86500 3.1457 -
1.9886 87000 3.1286 -
2.0 87500 3.1352 0.7525

Framework Versions

  • Python: 3.10.13
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.1.2
  • Accelerate: 0.31.0
  • Datasets: 2.19.2
  • Tokenizers: 0.19.1
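
To reproduce this environment, the same versions can be pinned at install time, e.g.:

pip install sentence-transformers==3.0.1 transformers==4.41.2 torch==2.1.2 accelerate==0.31.0 datasets==2.19.2 tokenizers==0.19.1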

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

DenoisingAutoEncoderLoss

@inproceedings{wang-2021-TSDAE,
    title = "TSDAE: Using Transformer-based Sequential Denoising Auto-Encoderfor Unsupervised Sentence Embedding Learning",
    author = "Wang, Kexin and Reimers, Nils and Gurevych, Iryna", 
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    pages = "671--688",
    url = "https://arxiv.org/abs/2104.06979",
}