Commit 8e8ddc4 by Muennighoff
Parent: 424904b
Update README.md
README.md CHANGED
@@ -6,38 +6,19 @@ tags:
 - sentence-similarity
 ---

-# {MODEL_NAME}
-
-This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 4096 dimensional dense vector space and can be used for tasks like clustering or semantic search.
-
-<!--- Describe your model here -->
-
 ## Usage (Sentence-Transformers)

-
-
-```
-pip install -U sentence-transformers
-```
-
-Then you can use the model like this:
-
-```python
-from sentence_transformers import SentenceTransformer
-sentences = ["This is an example sentence", "Each sentence is converted"]
+For usage instructions, refer to: https://github.com/Muennighoff/sgpt#asymmetric-semantic-search

-model
-
-
+The model was trained with the command
+```bash
+CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch examples/training/ms_marco/train_bi-encoder_mnrl.py --model_name bigscience/bloom-7b1 --train_batch_size 32 --eval_batch_size 16 --freezenonbias --specb --lr 4e-4 --wandb --wandbwatchlog gradients --pooling weightedmean --gradcache --chunksize 8
 ```

-
-
 ## Evaluation Results

-<!--- Describe how your model was evaluated -->

-
+`{"ndcgs": {"sgpt-bloom-7b1-msmarco": {"scifact": {"NDCG@10": 0.71824}, "nfcorpus": {"NDCG@10": 0.35748}, "arguana": {"NDCG@10": 0.47281}, "scidocs": {"NDCG@10": 0.18435}, "fiqa": {"NDCG@10": 0.35736}, "cqadupstack": {"NDCG@10": 0.3708525}, "quora": {"NDCG@10": 0.74655}, "trec-covid": {"NDCG@10": 0.82731}, "webis-touche2020": {"NDCG@10": 0.2365}}}`


 ## Training
@@ -50,6 +31,8 @@ The model was trained with the parameters:
 {'batch_size': 32, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
 ```

+The model uses BitFit, weighted-mean pooling & GradCache, for details see: https://arxiv.org/abs/2202.08904
+
 **Loss**:

 `sentence_transformers.losses.MultipleNegativesRankingLoss.MNRLGradCache`
@@ -83,4 +66,11 @@ SentenceTransformer(

 ## Citing & Authors

-
+```bibtex
+@article{muennighoff2022sgpt,
+title={SGPT: GPT Sentence Embeddings for Semantic Search},
+author={Muennighoff, Niklas},
+journal={arXiv preprint arXiv:2202.08904},
+year={2022}
+}
+```
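The evaluation line added in this commit reports NDCG@10 for nine datasets. As a rough way to read it at a glance, the sketch below copies those values into a Python dict and computes their unweighted mean, plus the highest- and lowest-scoring datasets. The names `ndcg_at_10` and `mean_ndcg` are hypothetical, and the unweighted mean is an assumed aggregation, not a figure stated in the README. The values are copied by hand rather than parsed, because the JSON string as committed appears to lack its final closing brace.

```python
from statistics import fmean

# NDCG@10 per dataset for sgpt-bloom-7b1-msmarco, copied from the
# evaluation line in the README diff (dict name is illustrative).
ndcg_at_10 = {
    "scifact": 0.71824,
    "nfcorpus": 0.35748,
    "arguana": 0.47281,
    "scidocs": 0.18435,
    "fiqa": 0.35736,
    "cqadupstack": 0.3708525,
    "quora": 0.74655,
    "trec-covid": 0.82731,
    "webis-touche2020": 0.2365,
}


def mean_ndcg(scores):
    """Unweighted mean over datasets (an assumed way to aggregate)."""
    return fmean(scores.values())


# Datasets with the highest and lowest reported score.
best = max(ndcg_at_10, key=ndcg_at_10.get)
worst = min(ndcg_at_10, key=ndcg_at_10.get)

print(f"mean NDCG@10 over {len(ndcg_at_10)} datasets: {mean_ndcg(ndcg_at_10):.4f}")
print(f"highest: {best} ({ndcg_at_10[best]}), lowest: {worst} ({ndcg_at_10[worst]})")
```

A simple mean treats every dataset equally regardless of size; a different weighting would give a different summary number.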