YuxinJiang committed · Commit 2181da0 · Parent(s): a122445

Update README.md

README.md CHANGED
@@ -2,13 +2,14 @@
 license: mit
 ---
 # PromCSE: Improved Universal Sentence Embeddings with Prompt-based Contrastive Learning and Energy-based Learning
-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1lanXViJzbmGM1bwm8AflNUKmrvDidg_3?usp=sharing)
+
 arXiv link: https://arxiv.org/abs/2203.06875v2
 Published in [**EMNLP 2022**](https://2022.emnlp.org/)
 
-Our code is modified based on [SimCSE](https://github.com/princeton-nlp/SimCSE) and [P-tuning v2](https://github.com/THUDM/P-tuning-v2/). Here we would like to sincerely thank them for their excellent works.
+Our code is modified from [SimCSE](https://github.com/princeton-nlp/SimCSE) and [P-tuning v2](https://github.com/THUDM/P-tuning-v2/), and we sincerely thank both teams for their excellent work. Our model achieves results comparable to [PromptBERT](https://github.com/kongds/Prompt-BERT) **without manually designed discrete prompts**.
 
-We have released our supervised and unsupervised models on huggingface, which acquire **Top 1** results on 1 domain-shifted STS task and 4 standard STS tasks
+We have released our supervised and unsupervised models on Hugging Face, which achieve **Top 1** results on one domain-shifted STS task and four standard STS tasks:
 
 [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/deep-continuous-prompt-for-contrastive-1/semantic-textual-similarity-on-cxc)](https://paperswithcode.com/sota/semantic-textual-similarity-on-cxc?p=deep-continuous-prompt-for-contrastive-1)
 
@@ -35,6 +36,7 @@ We have released our supervised and unsupervised models on huggingface, which ac
 If you have any questions, feel free to raise an issue.
 
 
+
 ## Setups
 
 [![Python](https://img.shields.io/badge/python-3.8.2-blue?logo=python&logoColor=FED643)](https://www.python.org/downloads/release/python-382/)
@@ -52,7 +54,7 @@ In the following section, we describe how to train a PromCSE model by using our
 
 
 ### Evaluation
-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1lanXViJzbmGM1bwm8AflNUKmrvDidg_3?usp=sharing)
 
 Our evaluation code for sentence embeddings is based on a modified version of [SentEval](https://github.com/facebookresearch/SentEval). It evaluates sentence embeddings on semantic textual similarity (STS) tasks and downstream transfer tasks. For STS tasks, our evaluation takes the "all" setting and reports Spearman's correlation. The STS tasks include seven standard STS tasks (STS12-16, STSB, SICK-R) and one domain-shifted STS task (CxC).
 
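(Editor's note: the Spearman's correlation reported above depends only on how the predicted similarities and the gold scores rank the sentence pairs, not on their magnitudes. A minimal plain-Python sketch of the statistic, without tie handling — this is an illustration, not the SentEval implementation:)

```python
def rank(values):
    # Rank positions: the smallest value gets rank 1 (assumes no ties).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def spearman(xs, ys):
    # rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), valid when there are no ties.
    n = len(xs)
    rx, ry = rank(xs), rank(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

print(spearman([0.1, 0.4, 0.9], [1.0, 2.0, 3.0]))  # perfectly monotonic -> 1.0
```

A model whose similarity scores order the pairs exactly as the gold annotations do scores 1.0, regardless of the scale of the scores.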
@@ -179,7 +181,13 @@ All our experiments are conducted on Nvidia 3090 GPUs.
 
 
 ## Usage
-We provide
+We provide [tool.py](https://github.com/YJiangcm/PromCSE/blob/master/tool.py), which contains the following functions:
+
+**(1) encode sentences into embedding vectors;
+(2) compute cosine similarities between sentences;
+(3) given queries, retrieve the top-k semantically similar sentences for each query.**
+
+You can try it by running
 ```bash
 python tool.py \
 --model_name_or_path YuxinJiang/unsup-promcse-bert-base-uncased \
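(Editor's note: for intuition about functions (2) and (3) added above — scoring by cosine similarity and retrieving the top-k most similar corpus entries reduce to a few lines once sentences are embedded. A toy sketch with made-up 2-D vectors standing in for PromCSE embeddings; this is not the actual tool.py code:)

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def top_k(query_vec, corpus_vecs, k=2):
    # Indices of the k corpus vectors most similar to the query.
    scores = [(cosine(query_vec, v), i) for i, v in enumerate(corpus_vecs)]
    return [i for _, i in sorted(scores, reverse=True)[:k]]

corpus = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy stand-ins for sentence embeddings
print(top_k([1.0, 0.05], corpus, k=2))  # -> [0, 1]
```

In tool.py the vectors come from the released PromCSE encoder; the ranking step is the same.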
@@ -271,4 +279,3 @@ Please cite our paper by:
 pages = "3021--3035",
 }
 ```
-