iwiwi committed on
Commit
420a5e6
1 Parent(s): ad9db52

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -68,7 +68,7 @@ Around 100B tokens from a mixture of the following corpora were used for the con
  - [Japanese mc4](https://huggingface.co/datasets/mc4)
  - [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz)
  - [Japanese OSCAR](https://oscar-project.github.io/documentation/)
- - [SlimPajama](https://huggingface.co/datasets/cerebras/SlimPajama-627B)
+ - [SlimPajama](https://huggingface.co/datasets/cerebras/SlimPajama-627B) without the Books3 subset
 
 
  ## Use and Limitations