Update README.md
README.md CHANGED
@@ -2608,7 +2608,7 @@ model-index:
 
 # gte-base-en-v1.5
 
-We introduce `gte-v1.5` series, upgraded `gte` embeddings that support the context length of up to **8192**
+We introduce the `gte-v1.5` series, upgraded `gte` embeddings that support a context length of up to **8192**, while further enhancing model performance.
 The models are built upon the `transformer++` encoder [backbone](https://huggingface.co/Alibaba-NLP/new-impl) (BERT + RoPE + GLU).
 
 The `gte-v1.5` series achieve state-of-the-art scores on the MTEB benchmark within the same model size category and provide competitive results on the LoCo long-context retrieval tests (refer to [Evaluation](#evaluation)).
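As a quick orientation for readers of this diff, here is a minimal usage sketch of the long-context embeddings described in the hunk above. It is an illustration only: it assumes the `sentence-transformers` package (recent enough to accept `trust_remote_code`) and the published `Alibaba-NLP/gte-base-en-v1.5` checkpoint, and the `cos_sim` call simply mirrors the snippet referenced in the next hunk's context line.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Illustrative sketch, not part of the diff itself.
# trust_remote_code=True is needed because the `transformer++` backbone is shipped as custom model code.
model = SentenceTransformer("Alibaba-NLP/gte-base-en-v1.5", trust_remote_code=True)
model.max_seq_length = 8192  # the context length advertised above

sentences = [
    "what is the capital of China?",
    "Beijing is the capital of China.",
]
embeddings = model.encode(sentences)          # one embedding vector per input text
print(cos_sim(embeddings[0], embeddings[1]))  # cosine similarity of the pair
```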
@@ -2689,8 +2689,8 @@ print(cos_sim(embeddings[0], embeddings[1]))
 ### Training Data
 
 - Masked language modeling (MLM): `c4-en`
-- Weakly supervised contrastive (WSC) pre-training: GTE pre-training data
-- Supervised contrastive fine-tuning: GTE fine-tuning data
+- Weakly supervised contrastive (WSC) pre-training: [GTE](https://arxiv.org/pdf/2308.03281.pdf) pre-training data
+- Supervised contrastive fine-tuning: [GTE](https://arxiv.org/pdf/2308.03281.pdf) fine-tuning data
 
 ### Training Procedure
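The training-data list above names a contrastive pre-training stage and a contrastive fine-tuning stage but does not spell out the objective. Purely as a generic illustration (a standard in-batch-negatives InfoNCE loss in PyTorch, not the GTE training code), the "contrastive" part of such stages typically looks like this:

```python
import torch
import torch.nn.functional as F

def info_nce(query_emb: torch.Tensor, doc_emb: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """Generic in-batch-negatives contrastive loss (illustrative, not the GTE recipe).

    query_emb, doc_emb: (batch, dim) embeddings of paired texts; row i of the two
    tensors forms a positive pair, and every other row in the batch acts as a negative.
    """
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    logits = q @ d.T / temperature                      # (batch, batch) similarity logits
    labels = torch.arange(q.size(0), device=q.device)   # positives sit on the diagonal
    return F.cross_entropy(logits, labels)
```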
@@ -2734,14 +2734,16 @@ The gte evaluation setting: `mteb==1.2.0, fp16 auto mix precision, max_length=81
 
-## Citation
-
-**APA:**
-
-[More Information Needed]
+## Citation
+
+If you find our paper or models helpful, please consider citing them as follows:
+
+```
+@article{li2023towards,
+  title={Towards general text embeddings with multi-stage contrastive learning},
+  author={Li, Zehan and Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Pengjun and Zhang, Meishan},
+  journal={arXiv preprint arXiv:2308.03281},
+  year={2023}
+}
+```
 
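The header of the hunk above quotes the card's evaluation setting (`mteb==1.2.0`, fp16 auto mixed precision, a long `max_length`). A rough sketch of such a run is below; the task selection and output folder are illustrative assumptions, not the configuration behind the reported MTEB/LoCo numbers, and the fp16 autocast detail is omitted for brevity.

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Illustrative MTEB run (assumes `pip install mteb==1.2.0 sentence-transformers`).
model = SentenceTransformer("Alibaba-NLP/gte-base-en-v1.5", trust_remote_code=True)
model.max_seq_length = 8192  # matches the 8192 context length advertised at the top of the card

evaluation = MTEB(tasks=["STSBenchmark"])  # a single English task stands in for the full suite
results = evaluation.run(model, output_folder="results/gte-base-en-v1.5")
print(results)
```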