TJKlein committed on
Commit
6b91936
1 Parent(s): 445e83e

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -18,7 +18,7 @@ The model intended to be used for encoding sentences or short paragraphs. Given

  # Training data

- The model was trained on a random collection of **English** sentences from Wikipedia. The *full-shot* training file is available [here].(https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/resolve/main/wiki1m_for_simcse.txt)
+ The model was trained on a random collection of **English** sentences from Wikipedia. The *full-shot* training file is available [here](https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/resolve/main/wiki1m_for_simcse.txt).
  Low-shot training data consists of data splits of different sizes (from 10% to 0.0064%) of the [SimCSE](https://github.com/princeton-nlp/SimCSE) training corpus. Each split size comprises 5 files, created with a different seed indicated with filename postfix.
  Data can be downloaded [here](https://huggingface.co/datasets/sap-ai-research/datasets-for-micse).
