guac committed
Commit 7664ba3
1 Parent(s): 04653c1
[fix] Make `Training Data` section header-2
README.md CHANGED
@@ -55,7 +55,7 @@ for sentence, score in sentence_score_pairs:
 ```
 
 
-
+## Training data
 
 Carptriever-1 is pre-trained on a de-duplicated subset of [The Pile](https://pile.eleuther.ai/), a large and diverse dataset created by EleutherAI for language model training. This subset was created through a [Minhash LSH](http://ekzhu.com/datasketch/lsh.html) process using a threshold of `0.87`.
 
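
For readers unfamiliar with the de-duplication step the README paragraph mentions, here is a minimal sketch of MinHash-based near-duplicate filtering at the quoted `0.87` threshold. It is not the authors' pipeline: the word 5-gram shingling, the 128-value signature length, and all function names are assumptions made for illustration, and it compares signatures pairwise instead of building an LSH index (which a library such as datasketch provides through its `MinHashLSH` class for large corpora).

```python
import hashlib

NUM_PERM = 128    # assumed signature length; not stated in the README
THRESHOLD = 0.87  # Jaccard threshold quoted in the README


def shingles(text, k=5):
    """Return the set of lowercase word k-grams (k=5 is an assumption)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash_signature(items, num_perm=NUM_PERM):
    """Keep the minimum hash per seeded hash function; seeds stand in for permutations."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def deduplicate(docs, threshold=THRESHOLD):
    """Keep a document only if no already-kept document reaches the threshold."""
    kept, kept_sigs = [], []
    for doc in docs:
        sig = minhash_signature(shingles(doc))
        if all(estimated_jaccard(sig, other) < threshold for other in kept_sigs):
            kept.append(doc)
            kept_sigs.append(sig)
    return kept
```

The pairwise loop is quadratic in the number of kept documents, so it only suits small samples; an LSH index exists precisely to avoid that cost by bucketing signatures so that only likely matches are compared.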